INTER-UNIVERSAL TEICHMÜLLER THEORY IV: LOG-VOLUME COMPUTATIONS AND SET-THEORETIC FOUNDATIONS

Shinichi Mochizuki

April 2020

Abstract. The present paper forms the fourth and final paper in a series of papers concerning “inter-universal Teichmüller theory”. In the first three papers of the series, we introduced and studied the theory surrounding the log-theta-lattice, a highly non-commutative two-dimensional diagram of “miniature models of conventional scheme theory”, called $\Theta^{\pm\mathrm{ell}}$NF-Hodge theaters, that were associated, in the first paper of the series, to certain data, called initial Θ-data. This data includes an elliptic curve $E_F$ over a number field $F$, together with a prime number $l \geq 5$. Consideration of various properties of the log-theta-lattice led naturally to the establishment, in the third paper of the series, of multiradial algorithms for constructing “splitting monoids of LGP-monoids”. Here, we recall that “multiradial algorithms” are algorithms that make sense from the point of view of an “alien arithmetic holomorphic structure”, i.e., the ring/scheme structure of a $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater related to a given $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater by means of a non-ring/scheme-theoretic horizontal arrow of the log-theta-lattice. In the present paper, estimates arising from these multiradial algorithms for splitting monoids of LGP-monoids are applied to verify various diophantine results which imply, for instance, the so-called Vojta Conjecture for hyperbolic curves, the ABC Conjecture, and the Szpiro Conjecture for elliptic curves. Finally, we examine (albeit from an extremely naive/non-expert point of view!) the foundational/set-theoretic issues surrounding the vertical and horizontal arrows of the log-theta-lattice by introducing and studying the basic properties of the notion of a “species”, which may be thought of as a sort of formalization, via set-theoretic formulas, of the intuitive notion of a “type of mathematical object”.
These foundational issues are closely related to the central role played in the present series of papers by various results from absolute anabelian geometry, as well as to the idea of gluing together distinct models of conventional scheme theory, i.e., in a fashion that lies outside the framework of conventional scheme theory. Moreover, it is precisely these foundational issues surrounding the vertical and horizontal arrows of the log-theta-lattice that led naturally to the introduction of the term “inter-universal”.

Contents:
Introduction
§0. Notations and Conventions
§1. Log-volume Estimates
§2. Diophantine Inequalities
§3. Inter-universal Formalism: the Language of Species

Typeset by AMS-TeX

Introduction

The present paper forms the fourth and final paper in a series of papers concerning “inter-universal Teichmüller theory”. In the first three papers, [IUTchI], [IUTchII], and [IUTchIII], of the series, we introduced and studied the theory surrounding the log-theta-lattice [cf. the discussion of [IUTchIII], Introduction], a highly non-commutative two-dimensional diagram of “miniature models of conventional scheme theory”, called $\Theta^{\pm\mathrm{ell}}$NF-Hodge theaters, that were associated, in the first paper [IUTchI] of the series, to certain data, called initial Θ-data. This data includes an elliptic curve $E_F$ over a number field $F$, together with a prime number $l \geq 5$ [cf. [IUTchI], §I1]. Consideration of various properties of the log-theta-lattice leads naturally to the establishment of multiradial algorithms for constructing “splitting monoids of LGP-monoids” [cf. [IUTchIII], Theorem A]. Here, we recall that “multiradial algorithms” [cf.
the discussion of the Introductions to [IUTchII], [IUTchIII]] are algorithms that make sense from the point of view of an “alien arithmetic holomorphic structure”, i.e., the ring/scheme structure of a $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater related to a given $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater by means of a non-ring/scheme-theoretic horizontal arrow of the log-theta-lattice. In the final portion of [IUTchIII], by applying these multiradial algorithms for splitting monoids of LGP-monoids, we obtained estimates for the log-volume of these LGP-monoids [cf. [IUTchIII], Theorem B]. In the present paper, these estimates will be applied to verify various diophantine results.

In §1 of the present paper, we start by discussing various elementary estimates for the log-volume of various tensor products of the modules obtained by applying the p-adic logarithm to the local units, i.e., in the terminology of [IUTchIII], “tensor packets of log-shells” [cf. the discussion of [IUTchIII], Introduction], in terms of various well-known invariants, such as differents, associated to a mixed-characteristic nonarchimedean local field [cf. Propositions 1.1, 1.2, 1.3, 1.4]. We then discuss similar (but technically much simpler!) log-volume estimates in the case of complex archimedean local fields [cf. Proposition 1.5]. After reviewing a certain classical estimate concerning the distribution of prime numbers [cf. Proposition 1.6], as well as some elementary general nonsense concerning weighted averages [cf. Proposition 1.7] and well-known elementary facts concerning elliptic curves [cf. Proposition 1.8], we then proceed to compute explicitly, in more elementary language, the quantity that was estimated in [IUTchIII], Theorem B. These computations yield a quite strong/explicit diophantine inequality [cf. Theorem 1.10] concerning elliptic curves that are in “sufficiently general position”, so that one may apply the general theory developed in the first three papers of the series.
In §2 of the present paper, after reviewing another classical estimate concerning the distribution of prime numbers [cf. Proposition 2.1, (ii)], we then proceed to apply the theory of [GenEll] to reduce various diophantine results concerning an arbitrary elliptic curve over a number field to results of the type obtained in Theorem 1.10 concerning elliptic curves that are in “sufficiently general position” [cf. Corollary 2.2]. This reduction allows us to derive the following result [cf. Corollary 2.3], which constitutes the main application of the “inter-universal Teichmüller theory” developed in the present series of papers.

Theorem A. (Diophantine Inequalities) Let $X$ be a smooth, proper, geometrically connected curve over a number field; $D \subseteq X$ a reduced divisor; $U_X := X \setminus D$; $d$ a positive integer; $\epsilon \in \mathbb{R}_{>0}$ a positive real number. Write $\omega_X$ for the canonical sheaf on $X$. Suppose that $U_X$ is a hyperbolic curve, i.e., that the degree of the line bundle $\omega_X(D)$ is positive. Then, relative to the notation of [GenEll] [reviewed in the discussion preceding Corollary 2.2 of the present paper], one has an inequality of “bounded discrepancy classes”

$$\mathrm{ht}_{\omega_X(D)} \lesssim (1 + \epsilon)(\text{log-diff}_X + \text{log-cond}_D)$$

of functions on $U_X(\overline{\mathbb{Q}})^{\leq d}$, i.e., the function $(1 + \epsilon)(\text{log-diff}_X + \text{log-cond}_D) - \mathrm{ht}_{\omega_X(D)}$ is bounded below by a constant on $U_X(\overline{\mathbb{Q}})^{\leq d}$ [cf. [GenEll], Definition 1.2, (ii), as well as Remark 2.3.1, (ii), of the present paper].

Thus, Theorem A asserts an inequality concerning the canonical height [i.e., “$\mathrm{ht}_{\omega_X(D)}$”], the logarithmic different [i.e., “$\text{log-diff}_X$”], and the logarithmic conductor [i.e., “$\text{log-cond}_D$”] of points of the curve $U_X$ valued in number fields whose extension degree over $\mathbb{Q}$ is $\leq d$. In particular, the so-called Vojta Conjecture for hyperbolic curves, the ABC Conjecture, and the Szpiro Conjecture for elliptic curves all follow as special cases of Theorem A. We refer to [Vjt] for a detailed exposition of these conjectures.
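To illustrate how the ABC Conjecture arises as a special case of Theorem A, the following sketch records the standard specialization to $X = \mathbb{P}^1_{\mathbb{Q}}$, $D = \{0, 1, \infty\}$, $d = 1$. The identifications of the height and the log-conductor with the elementary quantities below are assumptions of this sketch [they hold only up to bounded functions and depend on the normalizations of [GenEll]], and the constant $C_\epsilon$ is a name introduced here purely for illustration.

```latex
% Sketch: ABC as the case X = P^1_Q, D = {0, 1, \infty}, d = 1 of Theorem A.
% Assumption: a, b, c are coprime nonzero integers with a + b = c,
% and \lambda = a/c is regarded as a point of U_X(\mathbb{Q}).
\begin{align*}
\deg \omega_X(D) &= -2 + 3 = 1 > 0
  && \text{[so $U_X$ is hyperbolic]} \\
\mathrm{ht}_{\omega_X(D)}(\lambda) &\approx \log \max(|a|, |b|, |c|)
  && \text{[up to a bounded function]} \\
\text{log-diff}_X(\lambda) &= 0
  && \text{[the field of definition is $\mathbb{Q}$]} \\
\text{log-cond}_D(\lambda) &\approx \sum_{p \mid abc} \log(p) = \log \mathrm{rad}(abc)
  && \text{[primes where $\lambda$ meets $D$]} \\
\intertext{so that the inequality of Theorem A takes the form}
\log \max(|a|, |b|, |c|) &\leq (1 + \epsilon) \cdot \log \mathrm{rad}(abc) + \log C_\epsilon .
\end{align*}
```

Exponentiating the last line yields the familiar shape $\max(|a|, |b|, |c|) \leq C_\epsilon \cdot \mathrm{rad}(abc)^{1+\epsilon}$ of the ABC inequality.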
Finally, in §3, we examine (albeit from an extremely naive/non-expert point of view!) certain foundational issues underlying the theory of the present series of papers. Typically in mathematical discussions [i.e., by mathematicians who are not equipped with a detailed knowledge of the theory of foundations!] (such as, for instance, the theory developed in the present series of papers!) one defines various “types of mathematical objects” [i.e., such as groups, topological spaces, or schemes], together with a notion of “morphisms” between two particular examples of a specific type of mathematical object [i.e., morphisms between groups, between topological spaces, or between schemes]. Such objects and morphisms [typically] determine a category. On the other hand, if one restricts one’s attention to such a category, then one must keep in mind the fact that the structure of the category (i.e., which consists only of a collection of objects and morphisms satisfying certain properties!) does not include any mention of the various sets and conditions satisfied by those sets that give rise to the “type of mathematical object” under consideration. For instance, the data consisting of the underlying set of a group, the group multiplication law on the group, and the properties satisfied by this group multiplication law cannot be recovered [at least in an a priori sense!] from the structure of the “category of groups”. Put another way, although the notion of a “type of mathematical object” may give rise to a “category of such objects”, the notion of a “type of mathematical object” is much stronger, in the sense that it involves much more mathematical structure, than the notion of a category.
Indeed, a given “type of mathematical object” may have a very complicated internal structure, but may give rise to a category equivalent to a one-morphism category [i.e., a category with precisely one morphism]; in particular, in such cases, the structure of the associated category does not retain any information of interest concerning the internal structure of the “type of mathematical object” under consideration.

In Definition 3.1, (iii), we formalize this intuitive notion of a “type of mathematical object” by defining the notion of a species as, roughly speaking, a collection of set-theoretic formulas that gives rise to a category in any given model of set theory [cf. Definition 3.1, (iv)], but, unlike any specific category [e.g., of groups, etc.], is not confined to any specific model of set theory. In a similar vein, by working with collections of set-theoretic formulas, one may define a species-theoretic analogue of the notion of a functor, which we refer to as a mutation [cf. Definition 3.3, (i)]. Given a diagram of mutations, one may then define the notion of a “mutation that extracts, from the diagram, a certain portion of the types of mathematical objects that appear in the diagram that is invariant with respect to the mutations in the diagram”; we refer to such a mutation as a core [cf. Definition 3.3, (v)].

One fundamental example, in the context of the present series of papers, of a diagram of mutations is the usual set-up of [absolute] anabelian geometry [cf. Example 3.5 for more details]. That is to say, one begins with the species constituted by schemes satisfying certain conditions. One then considers the mutation

$$X \rightsquigarrow \Pi_X$$

that associates to such a scheme $X$ its étale fundamental group $\Pi_X$ [say, considered up to inner automorphisms].
Here, it is important to note that the codomain of this mutation is the species constituted by topological groups [say, considered up to inner automorphisms] that satisfy certain conditions which do not include any information concerning how the group is related [for instance, via some sort of étale fundamental group mutation] to a scheme. The notion of an anabelian reconstruction algorithm may then be formalized as a mutation that forms a “mutation-quasi-inverse” to the fundamental group mutation.

Another fundamental example, in the context of the present series of papers, of a diagram of mutations arises from the Frobenius morphism in positive characteristic scheme theory [cf. Example 3.6 for more details]. That is to say, one fixes a prime number $p$ and considers the species constituted by reduced quasi-compact schemes of characteristic $p$ and quasi-compact morphisms of schemes. One then considers the mutation

$$S \rightsquigarrow S^{(p)}$$

that associates to such a scheme $S$ the scheme $S^{(p)}$ with the same topological space, but whose regular functions are given by the p-th powers of the regular functions on the original scheme. Thus, the domain and codomain of this mutation are given by the same species. One may also consider a log scheme version of this example, which, at the level of monoids, corresponds, in essence, to assigning

$$M \rightsquigarrow p \cdot M$$

to a torsion-free abelian monoid $M$ the submonoid $p \cdot M \subseteq M$ determined by the image of multiplication by $p$. Returning to the case of schemes, one may then observe that the well-known constructions of the perfection and the étale site

$$S \rightsquigarrow S^{\mathrm{pf}}; \quad S \rightsquigarrow S^{\mathrm{\acute{e}t}}$$

associated to a reduced scheme $S$ of characteristic $p$ give rise to cores of the diagram obtained by considering iterates of the “Frobenius mutation” just discussed.
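To make the Frobenius mutation and the core constituted by the perfection concrete, the following sketch works out the simplest affine case. The choice $S = \mathrm{Spec}(A)$ with $A = \mathbb{F}_p[t]$, the notation $A^{(p)}$ for the ring of regular functions of $S^{(p)}$, and the monoid $M = \mathbb{N}$ are all assumptions introduced here for illustration; none of them appears in the text.

```latex
% Sketch (assumptions: S = Spec(A), A = F_p[t]; M = N).
% The Frobenius mutation replaces A by its subring of p-th powers,
% and the monoid M by the image of multiplication by p.
\begin{align*}
A &= \mathbb{F}_p[t], &
A^{(p)} &= \{ f^p \mid f \in A \} = \mathbb{F}_p[t^p] \subseteq A, \\
M &= \mathbb{N}, &
p \cdot M &= \{0, p, 2p, \ldots\} \subseteq M .
\end{align*}
% Iterating the Frobenius mutation yields a descending chain of subrings:
\[
A \;\supseteq\; \mathbb{F}_p[t^{p}] \;\supseteq\; \mathbb{F}_p[t^{p^2}] \;\supseteq\; \cdots
\]
% Every element of A is a p^n-th root of an element of F_p[t^{p^n}],
% so each ring in the chain has the same perfection:
\[
\bigl(\mathbb{F}_p[t^{p^n}]\bigr)^{\mathrm{pf}}
  \;=\; \bigcup_{m \geq 0} \mathbb{F}_p[t^{1/p^m}]
  \quad \text{for all } n \geq 0 .
\]
```

In this example the perfection is unchanged by any number of applications of the Frobenius mutation, which is the sense in which it is “invariant with respect to the mutations in the diagram”.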
This last example of the Frobenius mutation and the associated core constituted by the étale site is of particular importance in the context of the present series of papers in that it forms the “intuitive prototype” that underlies the theory of the vertical and horizontal lines of the log-theta-lattice [cf. the discussion of Remark 3.6.1, (i)]. One notable aspect of this example is the [evident!] fact that the domain and codomain of the Frobenius mutation are given by the same species. That is to say, despite the fact that in the construction of the scheme $S^{(p)}$ [cf. the notation of the preceding paragraph] from the scheme $S$, the scheme $S^{(p)}$ is “subordinate” to the scheme $S$, the domain and codomain species of the resulting Frobenius mutation coincide, hence, in particular, are on a par with one another. This sort of situation served, for the author, as a sort of model for the log- and $\Theta^{\times\mu}_{\mathrm{LGP}}$-links of the log-theta-lattice, which may be formulated as mutations between the species constituted by the notion of a $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater. That is to say, although in the construction of either the log- or the $\Theta^{\times\mu}_{\mathrm{LGP}}$-link, the domain and codomain $\Theta^{\pm\mathrm{ell}}$NF-Hodge theaters are by no means on a “par” with one another, the domain and codomain $\Theta^{\pm\mathrm{ell}}$NF-Hodge theaters of the resulting log-/$\Theta^{\times\mu}_{\mathrm{LGP}}$-links are regarded as objects of the same species, hence, in particular, completely on a par with one another. This sort of “relativization” of distinct models of conventional scheme theory over $\mathbb{Z}$ via the notion of a $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater [cf. Fig. I.1 below; the discussion of “gluing together” such models of conventional scheme theory in [IUTchI], §I2] is one of the most characteristic features of the theory developed in the present series of papers and, in particular, lies [tautologically!] outside the framework of conventional scheme theory over $\mathbb{Z}$.
That is to say, in the framework of conventional scheme theory over $\mathbb{Z}$, if one starts out with schemes over $\mathbb{Z}$ and constructs from them, say, by means of geometric objects such as the theta function on a Tate curve, some sort of Frobenioid that is isomorphic to a Frobenioid associated to $\mathbb{Z}$, then (unlike, for instance, the case of the Frobenius morphism in positive characteristic scheme theory) there is no way, within the framework of conventional scheme theory, to treat the newly constructed Frobenioid “as if it is the Frobenioid associated to $\mathbb{Z}$, relative to some new version/model of conventional scheme theory”.

[Fig. I.1: Relativized models of conventional scheme theory over $\mathbb{Z}$. The figure shows a sequence of models of conventional scheme theory over $\mathbb{Z}$, each joined to the next by a non-scheme-theoretic link.]

If, moreover, one thinks of $\mathbb{Z}$ as being constructed, in the usual way, via axiomatic set theory, then one may interpret the “absolute” (i.e., “tautologically unrelativizable”) nature of conventional scheme theory over $\mathbb{Z}$ at a purely set-theoretic level. Indeed, from the point of view of the “∈-structure” of axiomatic set theory, there is no way to treat sets constructed at distinct levels of this ∈-structure as being on a par with one another. On the other hand, if one focuses not on the level of the ∈-structure to which a set belongs, but rather on species, then the notion of a species allows one to relate (i.e., to treat on a par with one another) objects belonging to the species that arise from sets constructed at distinct levels of the ∈-structure. That is to say, the notion of a species allows one to “simulate ∈-loops” without violating the axiom of foundation of axiomatic set theory [cf. the discussion of Remark 3.3.1, (i)].
As one constructs sets at new levels of the ∈-structure of some model of axiomatic set theory (e.g., as one travels along vertical or horizontal lines of the log-theta-lattice!) one typically encounters new schemes, which give rise to new Galois categories, hence to new Galois or étale fundamental groups, which may only be constructed if one allows oneself to consider new basepoints, relative to new universes. In particular, one must continue to extend the universe, i.e., to modify the model of set theory, relative to which one works. Here, we recall in passing that such “extensions of universe” are possible on account of an existence axiom concerning universes, which is apparently attributed to the “Grothendieck school” and, moreover, cannot, apparently, be obtained as a consequence of the conventional ZFC axioms of axiomatic set theory [cf. the discussion at the beginning of §3 for more details]. On the other hand, ultimately in the present series of papers [cf. the discussion of [IUTchIII], Introduction], we wish to obtain algorithms for constructing various objects that arise in the context of the new schemes/universes discussed above (i.e., at distant $\Theta^{\pm\mathrm{ell}}$NF-Hodge theaters of the log-theta-lattice) that make sense from the point of view of the original schemes/universes that occurred at the outset of the discussion. Again, the fundamental tool that makes this possible, i.e., that allows one to express constructions in the new universes in terms that make sense in the original universe, is precisely the species-theoretic formulation (i.e., the formulation via set-theoretic formulas that do not depend on particular choices invoked in particular universes) of the constructions of interest [cf. the discussion of Remarks 3.1.2, 3.1.3, 3.1.4, 3.1.5, 3.6.2, 3.6.3]. This is the point of view that gave rise to the term “inter-universal”.
At a more concrete level, this “inter-universal” contact between constructions in distant models of conventional scheme theory in the log-theta-lattice is realized by considering [the étale-like structures given by] the various Galois or étale fundamental groups that occur as [the “type of mathematical object”, i.e., species constituted by] abstract topological groups [cf. the discussion of Remark 3.6.3, (i); [IUTchI], §I3]. These abstract topological groups give rise to vertical or horizontal cores of the log-theta-lattice [cf. the discussion of [IUTchIII], Introduction; [IUTchIII], Theorem 1.5, (i), (ii)]. Moreover, once one obtains cores that are sufficiently “nondegenerate”, or “rich in structure”, so as to serve as containers for the non-coric portions of the various mutations [e.g., vertical and horizontal arrows of the log-theta-lattice] under consideration, then one may construct the desired algorithms, or descriptions, of these non-coric portions in terms of coric containers, up to certain relatively mild indeterminacies [i.e., which reflect the non-coric nature of these non-coric portions!] [cf. the illustration of this sort of situation given in Fig. I.2 below; Remark 3.3.1, (iii); Remark 3.6.1, (ii)]. In the context of the log-theta-lattice, this is precisely the sort of situation that was achieved in [IUTchIII], Theorem A [cf. the discussion of [IUTchIII], Introduction].

[Fig. I.2: A coric container underlying a sequence of mutations]

In the context of the above discussion of set-theoretic aspects of the theory developed in the present series of papers, it is of interest to note the following observation, relative to the analogy between the theory of the present series of papers and p-adic Teichmüller theory [cf. the discussion of [IUTchI], §I4].
If, instead of working species-theoretically, one attempts to document all of the possible choices that occur in various newly introduced universes that occur in a construction, then one finds that one is obliged to work with sets, such as sets obtained via set-theoretic exponentiation, of very large cardinality. Such sets of large cardinality are reminiscent of the exponentially large denominators that occur if one attempts to p-adically formally integrate an arbitrary connection (as opposed to a canonical crystalline connection of the sort that occurs in the context of the canonical liftings of p-adic Teichmüller theory) [cf. the discussion of Remark 3.6.2, (iii)]. In this context, it is of interest to recall the computations of [Finot], which assert, roughly speaking, that the canonical liftings of p-adic Teichmüller theory may, in certain cases, be characterized as liftings “of minimal complexity” in the sense that their Witt vector coordinates are given by polynomials of minimal degree.

Finally, we observe that although, in the above discussion, we concentrated on the similarities, from an “inter-universal” point of view, between the vertical and horizontal arrows of the log-theta-lattice, there is one important difference between these vertical and horizontal arrows: namely,

· whereas the copies of the full arithmetic fundamental group (i.e., in particular, the copies of the geometric fundamental group) on either side of a vertical arrow are identified with one another,

· in the case of a horizontal arrow, only the Galois groups of the local base fields on either side of the arrow are identified with one another

[cf. the discussion of Remark 3.6.3, (ii)]. One way to understand the reason for this difference is as follows.
In the case of the vertical arrows (i.e., the log-links, which, in essence, amount to the various local p-adic logarithms), in order to construct the log-link, it is necessary to make use, in an essential way, of the local ring structures at $\underline{v} \in \underline{\mathbb{V}}$ [cf. the discussion of [IUTchIII], Definition 1.1, (i), (ii)], which may only be reconstructed from the full arithmetic fundamental group. By contrast, in order to construct the horizontal arrows (i.e., the $\Theta^{\times\mu}_{\mathrm{LGP}}$-links) this local ring structure is unnecessary. On the other hand, in order to construct the horizontal arrows, it is necessary to work with structures that, up to isomorphism, are common to both the domain and the codomain of the arrow. Since the construction of the domain of the $\Theta^{\times\mu}_{\mathrm{LGP}}$-link depends, in an essential way, on the Gaussian monoids, i.e., on the labels $\in \mathbb{F}_l^{\circledast}$ for the theta values, which are constructed from the geometric fundamental group, while the codomain only involves monoids arising from the local q-parameters “$\underline{\underline{q}}_{\underline{v}}$” [for $\underline{v} \in \underline{\mathbb{V}}^{\mathrm{bad}}$], which are constructed in a fashion that is independent of these labels, in order to obtain an isomorphism between structures arising from the domain and codomain, it is necessary to restrict one’s attention to the Galois groups of the local base fields, which are free of any dependence on these labels.

Acknowledgements:

The research discussed in the present paper profited enormously from the generous support that the author received from the Research Institute for Mathematical Sciences, a Joint Usage/Research Center located in Kyoto University. At a personal level, I would like to thank Fumiharu Kato, Akio Tamagawa, Go Yamashita, Mohamed Saïdi, Yuichiro Hoshi, Ivan Fesenko, Fucheng Tan, Emmanuel Lepage, Arata Minamide, and Wojciech Porowski for many stimulating discussions concerning the material presented in this paper.
Also, I feel deeply indebted to Go Yamashita, Mohamed Saïdi, and Yuichiro Hoshi for their meticulous reading of and numerous comments concerning the present paper. In addition, I would like to thank Kentaro Sato for useful comments concerning the set-theoretic and foundational aspects of the present paper, as well as Vesselin Dimitrov and Akshay Venkatesh for useful comments concerning the analytic number theory aspects of the present paper. Finally, I would like to express my deep gratitude to Ivan Fesenko for his quite substantial efforts to disseminate (for instance, in the form of a survey that he wrote) the theory discussed in the present series of papers.

Notations and Conventions:

We shall continue to use the “Notations and Conventions” of [IUTchI], §0.

Section 1: Log-volume Estimates

In the present §1, we perform various elementary local computations concerning nonarchimedean and archimedean local fields which allow us to obtain more explicit versions [cf. Theorem 1.10 below] of the log-volume estimates for Θ-pilot objects obtained in [IUTchIII], Corollary 3.12. In the following, if $\lambda \in \mathbb{R}$, then we shall write $\lceil\lambda\rceil$ (respectively, $\lfloor\lambda\rfloor$) for the smallest (respectively, largest) $n \in \mathbb{Z}$ such that $n \geq \lambda$ (respectively, $n \leq \lambda$). Also, we shall write “$\log(-)$” for the natural logarithm of a positive real number.

Proposition 1.1. (Multiple Tensor Products and Differents) Let $p$ be a prime number, $I$ a finite set of cardinality $\geq 2$, $\overline{\mathbb{Q}}_p$ an algebraic closure of $\mathbb{Q}_p$. Write $R \subseteq \overline{\mathbb{Q}}_p$ for the ring of integers of $\overline{\mathbb{Q}}_p$ and $\mathrm{ord}: \overline{\mathbb{Q}}_p^{\times} \to \mathbb{Q}$ for the natural p-adic valuation on $\overline{\mathbb{Q}}_p$, normalized so that $\mathrm{ord}(p) = 1$; for $\lambda \in \mathbb{Q}$, we shall write $p^\lambda$ for “some” [unspecified] element of $\overline{\mathbb{Q}}_p$ such that $\mathrm{ord}(p^\lambda) = \lambda$. For $i \in I$, let $k_i \subseteq \overline{\mathbb{Q}}_p$ be a finite extension of $\mathbb{Q}_p$; write $R_i := \mathcal{O}_{k_i} = R \cap k_i$ for the ring of integers of $k_i$ and $d_i \in \mathbb{Q}_{\geq 0}$ for the order [i.e., “$\mathrm{ord}(-)$”] of any generator of the different ideal of $R_i$ over $\mathbb{Z}_p$.
Also, for any nonempty subset $E \subseteq I$, let us write

$$R_E := \bigotimes_{i \in E} R_i; \quad d_E := \sum_{i \in E} d_i$$

where the tensor product is over $\mathbb{Z}_p$. Fix an element $* \in I$; write $I^* := I \setminus \{*\}$. Then

$$p^{d_{I^*}} \cdot (R_I)^\sim \subseteq R_I \subseteq (R_I)^\sim$$

where we write “$(-)^\sim$” for the normalization of the [reduced] ring in parentheses in its ring of fractions, and we observe that it follows immediately from the definition of the “normalization” that the notation on the left-hand side of the first inclusion of the above display is well-defined for suitable “$p^{d_{I^*}}$” [such as products of elements $p^{d_i} \in R_i$, for $i \in I^*$] and independent of the choice of such suitable “$p^{d_{I^*}}$”.

Proof. Let us regard $R_I$ as an $R_*$-algebra in the evident fashion. It is immediate from the definitions that $R_I \subseteq (R_I)^\sim$. Now observe that

$$R \otimes_{R_*} R_I \subseteq R \otimes_{R_*} (R_I)^\sim \subseteq (R \otimes_{R_*} R_I)^\sim$$

where $(R \otimes_{R_*} R_I)^\sim$ decomposes as a direct sum of finitely many copies of $R$. In particular, one verifies immediately, in light of the fact that $R$ is faithfully flat over $R_*$, that to complete the proof of Proposition 1.1, it suffices to verify that

$$p^{d_{I^*}} \cdot (R \otimes_{R_*} R_I)^\sim \subseteq R \otimes_{R_*} R_I$$

where we observe that it follows immediately from the definition of the “normalization” that the notation on the left-hand side of the inclusion of the above display is well-defined and independent of the choice of “$p^{d_{I^*}}$”. On the other hand, it follows immediately from induction on the cardinality of $I$ that to verify this last inclusion, it suffices to verify the inclusion in the case where $I$ is of cardinality two. But in this case, the desired inclusion follows immediately from the definition of the different ideal. This completes the proof of Proposition 1.1.

Proposition 1.2. (Differents and Logarithms) We continue to use the notation of Proposition 1.1. For $i \in I$, write $e_i$ for the ramification index of $k_i$ over $\mathbb{Q}_p$;

$$a_i := \frac{1}{e_i} \cdot \left\lceil \frac{e_i}{p - 2} \right\rceil \ \text{if } p > 2, \qquad a_i := 2 \ \text{if } p = 2; \qquad b_i := \left\lfloor \frac{\log(p \cdot e_i/(p - 1))}{\log(p)} \right\rfloor - \frac{1}{e_i}.$$

Thus, if $p > 2$ and $e_i \leq p - 2$, then $a_i = \frac{1}{e_i} = -b_i$. For any nonempty subset $E \subseteq I$, let us write

$$\log_p(R_E^\times) := \bigotimes_{i \in E} \log_p(R_i^\times); \quad a_E := \sum_{i \in E} a_i; \quad b_E := \sum_{i \in E} b_i$$

where the tensor product is over $\mathbb{Z}_p$; we write “$\log_p(-)$” for the p-adic logarithm. For $\lambda \in \frac{1}{e_i} \cdot \mathbb{Z}$, we shall write $p^\lambda \cdot R_i$ for the fractional ideal of $R_i$ generated by any element “$p^\lambda$” of $k_i$ such that $\mathrm{ord}(p^\lambda) = \lambda$. Let

$$\phi: \log_p(R_I^\times) \otimes \mathbb{Q}_p \xrightarrow{\sim} \log_p(R_I^\times) \otimes \mathbb{Q}_p$$

be an automorphism of the finite dimensional $\mathbb{Q}_p$-vector space $\log_p(R_I^\times) \otimes \mathbb{Q}_p$ that induces an automorphism of the submodule $\log_p(R_I^\times)$. Then:

(i) We have:

$$p^{a_i} \cdot R_i \subseteq \log_p(R_i^\times) \subseteq p^{-b_i} \cdot R_i$$

where the “⊆’s” are equalities when $p > 2$ and $e_i \leq p - 2$.

(ii) We have:

$$\phi(p^\lambda \cdot (R_I)^\sim) \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor} \cdot \log_p(R_I^\times) \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor - b_I} \cdot (R_I)^\sim$$

for any $\lambda \in \frac{1}{e_i} \cdot \mathbb{Z}$, $i \in I$. [Here, we observe that, just as in Proposition 1.1, it follows immediately from the definition of the “normalization” that the notation of the above display is well-defined and independent of the various choices involved.] In particular, $\phi((R_I)^\sim) \subseteq p^{-\lceil d_I + a_I \rceil} \cdot \log_p(R_I^\times) \subseteq p^{-\lceil d_I + a_I \rceil - b_I} \cdot (R_I)^\sim$.

(iii) Suppose that $p > 2$, and that $e_i \leq p - 2$ for all $i \in I$. Then we have:

$$\phi(p^\lambda \cdot (R_I)^\sim) \subseteq p^{\lambda - d_I - 1} \cdot (R_I)^\sim$$

for any $\lambda \in \frac{1}{e_i} \cdot \mathbb{Z}$, $i \in I$. [Here, we observe that, just as in Proposition 1.1, it follows immediately from the definition of the “normalization” that the notation of the above display is well-defined and independent of the various choices involved.] In particular, $\phi((R_I)^\sim) \subseteq p^{-d_I - 1} \cdot (R_I)^\sim$.

(iv) If $p > 2$ and $e_i = 1$ for all $i \in I$, then $\phi((R_I)^\sim) \subseteq (R_I)^\sim$.

Proof. Since $a_i > \frac{1}{p-1}$, $p^{b_i + \frac{1}{e_i}} > \frac{e_i}{p-1}$ [cf. the definition of “$\lceil - \rceil$”, “$\lfloor - \rfloor$”!], assertion (i) follows immediately from the well-known theory of the p-adic logarithm and exponential maps [cf., e.g., [Kobl], p. 81]. Next, we consider assertion (ii). Observe that it follows from the first displayed inclusion [of $R_I$-modules!]
of Proposition 1.1 that

$$p^{d_I + a_I} \cdot (R_I)^\sim \subseteq \bigotimes_{i \in I} p^{a_i} \cdot R_i \subseteq \bigotimes_{i \in I} R_i = R_I$$

and hence that

$$p^\lambda \cdot (R_I)^\sim \subseteq p^{\lambda - d_I - a_I} \cdot p^{d_I + a_I} \cdot (R_I)^\sim \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor} \cdot p^{d_I + a_I} \cdot (R_I)^\sim \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor} \cdot \log_p(R_I^\times) \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor - b_I} \cdot (R_I)^\sim$$

where, in the passage to the third and fourth inclusions following “$p^\lambda \cdot (R_I)^\sim$”, we apply assertion (i). [Here, we observe that, just as in Proposition 1.1, it follows immediately from the definition of the “normalization” that the notation of the above two displays is well-defined and independent of the various choices involved.] Thus, assertion (ii) follows immediately from the fact that $\phi$ induces an automorphism of the submodule $\log_p(R_I^\times)$. Assertion (iii) follows from assertion (ii), together with the fact that if $p > 2$ and $e_i \leq p - 2$ for all $i \in I$, then we have $a_I = -b_I$, which implies that $\lfloor \lambda - d_I - a_I \rfloor - b_I \geq \lambda - d_I - a_I - 1 - b_I = \lambda - d_I - 1$. Assertion (iv) follows from assertion (ii), together with the fact that if $p > 2$ and $e_i = 1$ for all $i \in I$, then we have $d_I = 0$, $a_I = -b_I \in \mathbb{Z}$. This completes the proof of Proposition 1.2.

Proposition 1.3. (Estimates of Differents) We continue to use the notation of Proposition 1.2. Suppose that $k_0 \subseteq k_i$ is a subfield that contains $\mathbb{Q}_p$. Write $R_0 := \mathcal{O}_{k_0}$ for the ring of integers of $k_0$, $d_0$ for the order [i.e., “$\mathrm{ord}(-)$”] of any generator of the different ideal of $R_0$ over $\mathbb{Z}_p$, $e_0$ for the ramification index of $k_0$ over $\mathbb{Q}_p$, $e_{i/0} := e_i/e_0$ ($\in \mathbb{Z}$), $[k_i : k_0]$ for the degree of the extension $k_i/k_0$, $n_i$ for the unique nonnegative integer such that $[k_i : k_0]/p^{n_i}$ is an integer prime to $p$. Then:

(i) We have:

$$d_i \geq d_0 + (e_{i/0} - 1)/(e_{i/0} \cdot e_0) = d_0 + (e_{i/0} - 1)/e_i$$

where the “≥” is an equality if $k_i$ is tamely ramified over $k_0$.

(ii) Suppose that $k_i$ is a finite Galois extension of a subfield $k_1 \subseteq k_i$ such that $k_0 \subseteq k_1$, and $k_1$ is tamely ramified over $k_0$. Then we have: $d_i \leq d_0 + n_i + 1/e_0$.

Proof.
By replacing $k_0$ by an unramified extension of $k_0$ contained in $k_i$, we may assume without loss of generality in the following discussion that $k_i$ is a totally ramified extension of $k_0$. First, we consider assertion (i). Let $\pi_0$ be a uniformizer of $R_0$. Then there exists an isomorphism of $R_0$-algebras $R_0[x]/(f(x)) \xrightarrow{\sim} R_i$, where $f(x) \in R_0[x]$ is a monic polynomial which is $\equiv x^{e_{i/0}}$ (mod $\pi_0$), that maps $x \mapsto \pi_i$ for some uniformizer $\pi_i$ of $R_i$. Thus, the different $d_i$ may be computed as follows:

$$d_i - d_0 = \mathrm{ord}(f'(\pi_i)) \geq \min\bigl(\mathrm{ord}(\pi_0), \mathrm{ord}(e_{i/0} \cdot \pi_i^{e_{i/0} - 1})\bigr) \geq \min\Bigl(\frac{1}{e_0}, \mathrm{ord}(\pi_i^{e_{i/0} - 1})\Bigr) = \min\Bigl(\frac{1}{e_0}, \frac{e_{i/0} - 1}{e_{i/0} \cdot e_0}\Bigr) = \frac{e_{i/0} - 1}{e_i}$$

where, for $\lambda, \mu \in \mathbb{R}$ such that $\lambda \geq \mu$, we define $\min(\lambda, \mu) := \mu$. When $k_i$ is tamely ramified over $k_0$, one verifies immediately that the inequalities of the above display are, in fact, equalities. This completes the proof of assertion (i).

Next, we consider assertion (ii). We apply induction on $n_i$. Since assertion (ii) follows immediately from assertion (i) when $n_i = 0$, we may assume that $n_i \geq 1$, and that assertion (ii) has been verified for smaller “$n_i$”. By replacing $k_1$ by some tamely ramified extension of $k_1$ contained in $k_i$, we may assume without loss of generality that $\mathrm{Gal}(k_i/k_1)$ is a p-group. Since p-groups are solvable, and $k_i$ is a totally ramified extension of $k_0$, it follows that there exists a subextension $k_1 \subseteq k_* \subseteq k_i$ such that $k_i/k_*$ and $k_*/k_1$ are Galois extensions of degree $p$ and $p^{n_i - 1}$, respectively. Write $R_* := \mathcal{O}_{k_*}$ for the ring of integers of $k_*$, $d_*$ for the order [i.e., “$\mathrm{ord}(-)$”] of any generator of the different ideal of $R_*$ over $\mathbb{Z}_p$, and $e_*$ for the ramification index of $k_*$ over $\mathbb{Q}_p$. Thus, by the induction hypothesis, it follows that $d_* \leq d_0 + n_i - 1 + 1/e_0$. To verify that $d_i \leq d_0 + n_i + 1/e_0$, it suffices to verify that

$$d_i \leq d_0 + n_i + 1/e_0 + \epsilon$$

for any positive real number $\epsilon$. Thus, let us fix a positive real number $\epsilon$.
Then by possibly enlarging $k_i$ and $k_1$, we may also assume without loss of generality that the tamely ramified extension $k_1$ of $k_0$ contains a primitive $p$-th root of unity, and, moreover, that the ramification index $e_1$ of $k_1$ over $\mathbb{Q}_p$ satisfies the inequality $e_1 \geq p/\epsilon$ [so $e_\dagger \geq e_1 \geq p/\epsilon$]. Thus, $k_i$ is a Kummer extension of $k_\dagger$. In particular, there exists an inclusion of $R_\dagger$-algebras $R_\dagger[x]/(f(x)) \hookrightarrow R_i$, where $f(x) \in R_\dagger[x]$ is a monic polynomial which is of the form $f(x) = x^p - \varpi_\dagger$ for some element $\varpi_\dagger$ of $R_\dagger$ satisfying the estimates $0 \leq \mathrm{ord}(\varpi_\dagger) \leq \frac{p-1}{e_\dagger}$, that maps $x \mapsto \varpi_i$ for some element $\varpi_i$ of $R_i$ satisfying the estimates $0 \leq \mathrm{ord}(\varpi_i) \leq \frac{p-1}{p \cdot e_\dagger}$. Now we compute:
\[ d_i \leq \mathrm{ord}(f'(\varpi_i)) + d_\dagger \leq \mathrm{ord}(p \cdot \varpi_i^{p-1}) + d_0 + n_i - 1 + 1/e_0 = (p-1) \cdot \mathrm{ord}(\varpi_i) + d_0 + n_i + 1/e_0 \]
\[ \leq \frac{(p-1)^2}{p \cdot e_\dagger} + d_0 + n_i + 1/e_0 \leq \frac{p}{e_\dagger} + d_0 + n_i + 1/e_0 \leq d_0 + n_i + 1/e_0 + \epsilon \]
thus completing the proof of assertion (ii).

Remark 1.3.1. Similar estimates to those discussed in Proposition 1.3 may be found in [Ih], Lemma A.

Proposition 1.4. (Nonarchimedean Normalized Log-volume Estimates) We continue to use the notation of Proposition 1.2. Also, for $i \in I$, write $R_i^{\mu} \subseteq R_i^{\times}$ for the torsion subgroup of $R_i^{\times}$, $R_i^{\times\mu} \overset{\mathrm{def}}{=} R_i^{\times}/R_i^{\mu}$, $p^{f_i}$ for the cardinality of the residue field of $k_i$, and $p^{m_i}$ for the order of the $p$-primary component of $R_i^{\mu}$. Thus, the order of $R_i^{\mu}$ is equal to $p^{m_i} \cdot (p^{f_i} - 1)$. Then:

(i) The log-volumes constructed in [AbsTopIII], Proposition 5.7, (i), on the various finite extensions of $\mathbb{Q}_p$ contained in $\overline{\mathbb{Q}}_p$ may be suitably normalized [i.e., by dividing by the degree of the finite extension] so as to yield a notion of log-volume $\mu^{\log}(-)$ defined on compact open subsets of finite extensions of $\mathbb{Q}_p$ contained in $\overline{\mathbb{Q}}_p$, valued in $\mathbb{R}$, and normalized so that $\mu^{\log}(R_i) = 0$, $\mu^{\log}(p \cdot R_i) = -\log(p)$, for each $i \in I$.
Moreover, by applying the fact that tensor products of finitely many finite extensions of $\mathbb{Q}_p$ over $\mathbb{Z}_p$ decompose, naturally, as direct sums of finitely many finite extensions of $\mathbb{Q}_p$, we obtain a notion of log-volume, which, by abuse of notation, we shall also denote by "$\mu^{\log}(-)$", defined on compact open subsets of such tensor products, valued in $\mathbb{R}$, and normalized so that $\mu^{\log}((R_E)^{\sim}) = 0$, $\mu^{\log}(p \cdot (R_E)^{\sim}) = -\log(p)$, for any nonempty set $E \subseteq I$.

(ii) We have:
\[ \mu^{\log}(\log_p(R_i^{\times})) = -\Big(\frac{1}{e_i} + \frac{m_i}{e_i f_i}\Big) \cdot \log(p) \]
[cf. [AbsTopIII], Proposition 5.8, (iii)].

(iii) Let $I^* \subseteq I$ be a subset such that for each $i \in I \setminus I^*$, it holds that $p - 2 \geq e_i$ ($\geq 1$). Then for any $\lambda \in \frac{1}{e_{i^\dagger}} \cdot \mathbb{Z}$, $i^\dagger \in I$, we have inclusions
\[ \phi(p^{\lambda} \cdot (R_I)^{\sim}) \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor} \cdot \log_p(R_I^{\times}) \subseteq p^{\lfloor \lambda - d_I - a_I \rfloor - \lceil b_I \rceil} \cdot (R_I)^{\sim} \]
and inequalities
\[ \mu^{\log}\big(p^{\lfloor \lambda - d_I - a_I \rfloor} \cdot \log_p(R_I^{\times})\big) \leq \big(-\lambda + d_I + 1 + 4 \cdot |I^*|/p\big) \cdot \log(p); \]
\[ \mu^{\log}\big(p^{\lfloor \lambda - d_I - a_I \rfloor - \lceil b_I \rceil} \cdot (R_I)^{\sim}\big) \leq \big(-\lambda + d_I + 1\big) \cdot \log(p) + \sum_{i \in I^*} \{3 + \log(e_i)\} \]
where we write "$|(-)|$" for the cardinality of the set "$(-)$". Moreover, $d_I + a_I \geq |I|$ if $p > 2$; $d_I + a_I \geq 2 \cdot |I|$ if $p = 2$.

(iv) If $p > 2$ and $e_i = 1$ for all $i \in I$, then $\phi((R_I)^{\sim}) \subseteq (R_I)^{\sim}$, and $\mu^{\log}((R_I)^{\sim}) = 0$.

Proof. Assertion (i) follows immediately from the definitions. Next, we consider assertion (ii). We begin by observing that every compact open subset of $R_i^{\times\mu}$ may be covered by a finite collection of compact open subsets of $R_i^{\times\mu}$ that arise as images of compact open subsets of $R_i^{\times}$ that map injectively to $R_i^{\times\mu}$. In particular, by applying this observation, we conclude that the log-volume on $R_i^{\times}$ determines, in a natural way, a log-volume on the quotient $R_i^{\times} \twoheadrightarrow R_i^{\times\mu}$. Moreover, in light of the compatibility of the log-volume with "$\log_p(-)$" [cf. [AbsTopIII], Proposition 5.7, (i), (c)], it follows immediately that $\mu^{\log}(\log_p(R_i^{\times})) = \mu^{\log}(R_i^{\times\mu})$. Thus, it suffices to compute $e_i \cdot f_i \cdot \mu^{\log}(R_i^{\times\mu}) = e_i \cdot f_i \cdot \mu^{\log}(R_i^{\times}) - \log(p^{m_i} \cdot (p^{f_i} - 1))$.
On the other hand, it follows immediately from the basic properties of the log-volume [cf. [AbsTopIII], Proposition 5.7, (i), (a)] that $e_i \cdot f_i \cdot \mu^{\log}(R_i^{\times}) = \log(1 - p^{-f_i})$, so $e_i \cdot f_i \cdot \mu^{\log}(R_i^{\times\mu}) = -(f_i + m_i) \cdot \log(p)$, as desired. This completes the proof of assertion (ii).

The inclusions of assertion (iii) follow immediately from Proposition 1.2, (ii). When $p = 2$, the fact that $d_I + a_I \geq 2 \cdot |I|$ follows immediately from the definition of "$d_i$" and "$a_i$" in Propositions 1.1, 1.2. When $p > 2$, it follows immediately from the definition of "$a_i$" in Proposition 1.2 that $a_i \geq 1/e_i$, for all $i \in I$; thus, since $d_i \geq (e_i - 1)/e_i$ for all $i \in I$ [cf. Proposition 1.3, (i)], we conclude that $d_i + a_i \geq 1$ for all $i \in I$, and hence that $d_I + a_I \geq |I|$, as asserted in the statement of assertion (iii). Next, let us observe that $\frac{1}{p-2} \leq \frac{4}{p}$ for $p \geq 3$; $\frac{p}{p-1} \leq 2$ for $p \geq 2$; $\frac{2}{p} \leq \frac{1}{\log(p)}$ for $p \geq 2$. Thus, it follows immediately from the definition of $a_i$, $b_i$ in Proposition 1.2 that $a_i - \frac{1}{e_i} \leq \frac{4}{p} \leq \frac{2}{\log(p)}$, $(b_i + \frac{1}{e_i}) \cdot \log(p) \leq \log(2 e_i) \leq 1 + \log(e_i)$ for $i \in I$; $a_i = \frac{1}{e_i} = -b_i$ for $i \in I \setminus I^*$. On the other hand, by assertion (i), we have $\mu^{\log}(R_I) \leq \mu^{\log}((R_I)^{\sim}) = 0$; by assertion (ii), we have $\mu^{\log}(\log_p(R_i^{\times})) \leq -\frac{1}{e_i} \cdot \log(p)$. Now we compute:
\[ \mu^{\log}\big(p^{\lfloor \lambda - d_I - a_I \rfloor} \cdot \log_p(R_I^{\times})\big) \leq \big(-\lambda + d_I + a_I + 1\big) \cdot \log(p) + \mu^{\log}(\log_p(R_I^{\times})) \]
\[ = \big(-\lambda + d_I + a_I + 1\big) \cdot \log(p) + \Big\{\sum_{i \in I} \mu^{\log}(\log_p(R_i^{\times}))\Big\} + \mu^{\log}(R_I) \]
\[ \leq \Big\{-\lambda + d_I + 1 + \sum_{i \in I^*} \Big(a_i - \frac{1}{e_i}\Big)\Big\} \cdot \log(p) \leq \big(-\lambda + d_I + 1 + 4 \cdot |I^*|/p\big) \cdot \log(p); \]
\[ \mu^{\log}\big(p^{\lfloor \lambda - d_I - a_I \rfloor - \lceil b_I \rceil} \cdot (R_I)^{\sim}\big) \leq \big(-\lambda + d_I + a_I + b_I + 1\big) \cdot \log(p) \leq \big(-\lambda + d_I + 1\big) \cdot \log(p) + \sum_{i \in I^*} \{3 + \log(e_i)\} \]
thus completing the proof of assertion (iii). Assertion (iv) follows immediately from assertion (i) and Proposition 1.2, (iv).

Proposition 1.5. (Archimedean Metric Estimates) In the following, we shall regard the complex archimedean field $\mathbb{C}$ as being equipped with its standard Hermitian metric, i.e., the metric determined by the complex norm.
Let us refer to as the primitive automorphisms of $\mathbb{C}$ the group of automorphisms [of order 8] of the underlying metrized real vector space of $\mathbb{C}$ generated by the operations of complex conjugation and multiplication by $\pm 1$ or $\pm\sqrt{-1}$.

(i) (Direct Sum vs. Tensor Product Metrics) The metric on $\mathbb{C}$ determines a tensor product metric on $\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}$, as well as a direct sum metric on $\mathbb{C} \oplus \mathbb{C}$. Then, relative to these metrics, any isomorphism of topological rings [i.e., arising from the Chinese remainder theorem]
\[ \mathbb{C} \otimes_{\mathbb{R}} \mathbb{C} \stackrel{\sim}{\to} \mathbb{C} \oplus \mathbb{C} \]
is compatible with these metrics, up to a factor of 2, i.e., the metric on the right-hand side corresponds to 2 times the metric on the left-hand side. [Thus, lengths differ by a factor of $\sqrt{2}$.]

(ii) (Direct Sum vs. Tensor Product Automorphisms) Relative to the notation of (i), the direct sum decomposition $\mathbb{C} \oplus \mathbb{C}$, together with its Hermitian metric, is preserved, relative to the displayed isomorphism of (i), by the automorphisms of $\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}$ induced by the various primitive automorphisms of the two copies of "$\mathbb{C}$" that appear in the tensor product $\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}$.

(iii) (Direct Sums and Tensor Products of Multiple Copies) Let $I$, $V$ be nonempty finite sets, whose cardinalities we denote by $|I|$, $|V|$, respectively. Write
\[ M \overset{\mathrm{def}}{=} \bigoplus_{v \in V} \mathbb{C}_v \]
for the direct sum of copies $\mathbb{C}_v \overset{\mathrm{def}}{=} \mathbb{C}$ of $\mathbb{C}$ labeled by $v \in V$, which we regard as equipped with the direct sum metric, and
\[ M_I \overset{\mathrm{def}}{=} \bigotimes_{i \in I} M_i \]
for the tensor product over $\mathbb{R}$ of copies $M_i \overset{\mathrm{def}}{=} M$ of $M$ labeled by $i \in I$, which we regard as equipped with the tensor product metric [cf. the constructions of [IUTchIII], Proposition 3.2, (ii)]. Then the topological ring structure on each $\mathbb{C}_v$ determines a topological ring structure on $M_I$ with respect to which $M_I$ admits a unique direct sum decomposition as a direct sum of $2^{|I|-1} \cdot |V|^{|I|}$ copies of $\mathbb{C}$ [cf. [IUTchIII], Proposition 3.1, (i)].
The direct sum metric on $M_I$, i.e., the metric determined by the natural metrics on these copies of $\mathbb{C}$, is equal to $2^{|I|-1}$ times the original tensor product metric on $M_I$. Write $B_I \subseteq M_I$ for the "integral structure" [cf. the constructions of [IUTchIII], Proposition 3.1, (ii)] given by the direct product of the unit balls of the copies of $\mathbb{C}$ that occur in the direct sum decomposition of $M_I$. Then the tensor product metric on $M_I$, the direct sum decomposition of $M_I$, the direct sum metric on $M_I$, and the integral structure $B_I \subseteq M_I$ are preserved by the automorphisms of $M_I$ induced by the various primitive automorphisms of the direct summands "$\mathbb{C}_v$" that appear in the factors "$M_i$" of the tensor product $M_I$.

(iv) (Tensor Product of Vectors of a Given Length) Suppose that we are in the situation of (iii). Fix $\lambda \in \mathbb{R}_{>0}$. Then
\[ M_I \ni \bigotimes_{i \in I} m_i \in \lambda^{|I|} \cdot B_I \]
for any collection of elements $\{m_i \in M_i\}_{i \in I}$ such that the component of $m_i$ in each direct summand "$\mathbb{C}_v$" of $M_i$ is of length $\leq \lambda$.

Proof. Assertions (i) and (ii) are discussed in [IUTchIII], Remark 3.9.1, (ii), and may be verified by means of routine and elementary arguments. Assertion (iii) follows immediately from assertions (i) and (ii). Assertion (iv) follows immediately from the various definitions involved.

Proposition 1.6. (The Prime Number Theorem) If $n$ is a positive integer, then let us write $p_n$ for the $n$-th smallest prime number. [Thus, $p_1 = 2$, $p_2 = 3$, and so on.] Then there exists an integer $n_0$ such that it holds that
\[ n \leq \frac{4 p_n}{3 \cdot \log(p_n)} \]
for all $n \geq n_0$. In particular, there exists a positive real number $\eta_{\mathrm{prm}}$ such that
\[ \sum_{p \leq \eta} 1 \leq \frac{4 \eta}{3 \cdot \log(\eta)} \]
for all positive real $\eta \geq \eta_{\mathrm{prm}}$, where the sum ranges over the prime numbers $p \leq \eta$.

Proof. Relative to our notation, the Prime Number Theorem [cf., e.g., [DmMn], §3.10] implies that
\[ \lim_{n \to \infty} \frac{n \cdot \log(p_n)}{p_n} = 1 \]
i.e., in particular, that for some positive integer $n_0$, it holds that
\[ \frac{\log(p_n)}{p_n} \leq \frac{4}{3} \cdot \frac{1}{n} \]
for all $n \geq n_0$.
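[The Prime Number Theorem only yields the existence of $n_0$. For readers who wish to examine the inequality $n \leq 4 p_n / (3 \cdot \log(p_n))$ numerically, the following Python sketch may be of use. It is an illustration only and plays no role in the argument; the function names `primes_up_to` and `worst_ratio`, as well as the search bound $10^4$, are choices made for this example. The sketch computes the largest value of $n \cdot \log(p_n)/p_n$ among the primes below the bound; the inequality holds throughout that range precisely when this value is at most $4/3$.]

```python
import math


def primes_up_to(limit):
    """Return the list of primes <= limit (sieve of Eratosthenes)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Strike out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [k for k, flag in enumerate(sieve) if flag]


def worst_ratio(limit):
    """Largest n*log(p_n)/p_n over primes p_n <= limit, with its n and p_n."""
    return max(
        (n * math.log(p) / p, n, p)
        for n, p in enumerate(primes_up_to(limit), start=1)
    )


if __name__ == "__main__":
    ratio, n, p = worst_ratio(10_000)
    print(f"largest n*log(p_n)/p_n for p_n <= 10^4: {ratio:.4f} (n={n}, p_n={p})")
    print("n <= 4*p_n/(3*log(p_n)) throughout this range:", ratio <= 4 / 3)
```

[A finite check of this kind gives no information about the constants $n_0$ and $\eta_{\mathrm{prm}}$ themselves, whose existence rests on the Prime Number Theorem.]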
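The final portion of Proposition 1.6 follows formally.

Proposition 1.7. (Weighted Averages) Let $E$ be a nonempty finite set, $n$ a positive integer. For $e \in E$, let $\lambda_e \in \mathbb{R}_{>0}$, $\beta_e \in \mathbb{R}$. Then, for any $i = 1, \ldots, n$, we have:
\[ \frac{\sum_{\vec{e} \in E^n} \beta_{\vec{e}} \cdot \lambda_{\Pi \vec{e}}}{\sum_{\vec{e} \in E^n} \lambda_{\Pi \vec{e}}} = \frac{\sum_{\vec{e} \in E^n} n \cdot \beta_{e_i} \cdot \lambda_{\Pi \vec{e}}}{\sum_{\vec{e} \in E^n} \lambda_{\Pi \vec{e}}} = n \cdot \beta_{\mathrm{avg}} \]
where we write
\[ \beta_{\mathrm{avg}} \overset{\mathrm{def}}{=} \beta_E / \lambda_E, \quad \beta_E \overset{\mathrm{def}}{=} \sum_{e \in E} \beta_e \cdot \lambda_e, \quad \lambda_E \overset{\mathrm{def}}{=} \sum_{e \in E} \lambda_e, \quad \beta_{\vec{e}} \overset{\mathrm{def}}{=} \sum_{j=1}^{n} \beta_{e_j}; \quad \lambda_{\Pi \vec{e}} \overset{\mathrm{def}}{=} \prod_{j=1}^{n} \lambda_{e_j} \]
for any $n$-tuple $\vec{e} = (e_1, \ldots, e_n) \in E^n$ of elements of $E$.

Proof. We begin by observing that
\[ \lambda_E^n = \sum_{\vec{e} \in E^n} \lambda_{\Pi \vec{e}}; \quad \beta_E \cdot \lambda_E^{n-1} = \sum_{\vec{e} \in E^n} \beta_{e_i} \cdot \lambda_{\Pi \vec{e}} \]
for any $i = 1, \ldots, n$. Thus, summing over $i$, we obtain that
\[ n \cdot \beta_E \cdot \lambda_E^{n-1} = \sum_{\vec{e} \in E^n} \beta_{\vec{e}} \cdot \lambda_{\Pi \vec{e}} = \sum_{\vec{e} \in E^n} n \cdot \beta_{e_i} \cdot \lambda_{\Pi \vec{e}} \]
and hence that
\[ n \cdot \beta_{\mathrm{avg}} = n \cdot \beta_E \cdot \lambda_E^{n-1} / \lambda_E^n = \Big(\sum_{\vec{e} \in E^n} \beta_{\vec{e}} \cdot \lambda_{\Pi \vec{e}}\Big) \cdot \Big(\sum_{\vec{e} \in E^n} \lambda_{\Pi \vec{e}}\Big)^{-1} = \Big(\sum_{\vec{e} \in E^n} n \cdot \beta_{e_i} \cdot \lambda_{\Pi \vec{e}}\Big) \cdot \Big(\sum_{\vec{e} \in E^n} \lambda_{\Pi \vec{e}}\Big)^{-1} \]
as desired.

Remark 1.7.1. In Theorem 1.10 below, we shall apply Proposition 1.7 to compute various packet-normalized log-volumes of the sort discussed in [IUTchIII], Proposition 3.9, (i), i.e., log-volumes normalized by means of the normalized weights discussed in [IUTchIII], Remark 3.1.1, (ii). Here, we recall that the normalized weights discussed in [IUTchIII], Remark 3.1.1, (ii), were computed relative to the non-normalized log-volumes of [AbsTopIII], Proposition 5.8, (iii), (vi) [cf. the discussion of [IUTchIII], Remark 3.1.1, (ii); [IUTchI], Example 3.5, (iii)]. By contrast, in the discussion of the present §1, our computations are performed relative to normalized log-volumes as discussed in Proposition 1.4, (i). In particular, it follows that the weights $[K_{\underline{v}} : (F_{\mathrm{mod}})_v]^{-1}$, where $\underline{\mathbb{V}} \ni \underline{v} \mid v \in \mathbb{V}_{\mathrm{mod}}$, of the discussion of [IUTchIII], Remark 3.1.1, (ii), must be replaced [i.e., when one works with normalized log-volumes as in Proposition 1.4, (i)] by the weights
\[ [K_{\underline{v}} : \mathbb{Q}_{v_{\mathbb{Q}}}] \cdot [K_{\underline{v}} : (F_{\mathrm{mod}})_v]^{-1} = [(F_{\mathrm{mod}})_v : \mathbb{Q}_{v_{\mathbb{Q}}}] \]
where $\mathbb{V}_{\mathrm{mod}} \ni v \mid v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$.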
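This means that the normalized weights of the final display of [IUTchIII], Remark 3.1.1, (ii), must be replaced, when one works with normalized log-volumes as in Proposition 1.4, (i), by the normalized weights
\[ \frac{\prod_{\alpha \in A} [(F_{\mathrm{mod}})_{v_\alpha} : \mathbb{Q}_{v_{\mathbb{Q}}}]}{\sum_{\{w_\alpha\}_{\alpha \in A}} \Big\{ \prod_{\alpha \in A} [(F_{\mathrm{mod}})_{w_\alpha} : \mathbb{Q}_{v_{\mathbb{Q}}}] \Big\}} \]
where the sum is over all collections $\{w_\alpha\}_{\alpha \in A}$ of [not necessarily distinct!] elements $w_\alpha \in \mathbb{V}_{\mathrm{mod}}$ lying over $v_{\mathbb{Q}}$ and indexed by $\alpha \in A$. Thus, in summary, when one works with normalized log-volumes as in Proposition 1.4, (i), the appropriate normalized weights are given by the expressions
\[ \frac{\lambda_{\Pi \vec{e}}}{\sum_{\vec{e}\,' \in E^n} \lambda_{\Pi \vec{e}\,'}} \]
[where $\vec{e} \in E^n$] that appear in Proposition 1.7. Here, one takes "$E$" to be the set of elements of $\underline{\mathbb{V}} \stackrel{\sim}{\to} \mathbb{V}_{\mathrm{mod}}$ lying over a fixed $v_{\mathbb{Q}}$; one takes "$n$" to be the cardinality of $A$, so that one can write $A = \{\alpha_1, \ldots, \alpha_n\}$ [where the $\alpha_i$ are distinct]; if $e \in E$ corresponds to $\underline{v} \in \underline{\mathbb{V}}$, $v \in \mathbb{V}_{\mathrm{mod}}$, then one takes "$\lambda_e$" $\overset{\mathrm{def}}{=} [(F_{\mathrm{mod}})_v : \mathbb{Q}_{v_{\mathbb{Q}}}] \in \mathbb{R}_{>0}$ and "$\beta_e$" to be a normalized log-volume of some compact open subset of $K_{\underline{v}}$.

Before proceeding, we review some well-known elementary facts concerning elliptic curves. In the following, we shall write $\mathcal{M}_{\mathrm{ell}}$ for the moduli stack of elliptic curves over $\mathbb{Z}$ and $\mathcal{M}_{\mathrm{ell}} \subseteq \overline{\mathcal{M}}_{\mathrm{ell}}$ for the natural compactification of $\mathcal{M}_{\mathrm{ell}}$, i.e., the moduli stack of one-dimensional semi-abelian schemes over $\mathbb{Z}$. Also, if $R$ is a $\mathbb{Z}$-algebra, then we shall write $(\mathcal{M}_{\mathrm{ell}})_R \overset{\mathrm{def}}{=} \mathcal{M}_{\mathrm{ell}} \times_{\mathbb{Z}} R$, $(\overline{\mathcal{M}}_{\mathrm{ell}})_R \overset{\mathrm{def}}{=} \overline{\mathcal{M}}_{\mathrm{ell}} \times_{\mathbb{Z}} R$.

Proposition 1.8. (Torsion Points of Elliptic Curves) Let $k$ be a perfect field, $\overline{k}$ an algebraic closure of $k$. Write $G_k \overset{\mathrm{def}}{=} \mathrm{Gal}(\overline{k}/k)$.

(i) ("Serre's Criterion") Let $l \geq 3$ be a prime number that is invertible in $k$; suppose that $\overline{k} = k$. Let $A$ be an abelian variety over $k$, equipped with a polarization $\lambda$. Write $A[l] \subseteq A(k)$ for the group of $l$-torsion points of $A(k)$. Then the natural map
\[ \phi : \mathrm{Aut}_k(A, \lambda) \to \mathrm{Aut}(A[l]) \]
from the group of automorphisms of the polarized abelian variety $(A, \lambda)$ over $k$ to the group of automorphisms of the abelian group $A[l]$ is injective.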
(ii) Let $E_{\overline{k}}$ be an elliptic curve over $\overline{k}$ with origin $\epsilon_E \in E_{\overline{k}}(\overline{k})$. For $n$ a positive integer, write $E_{\overline{k}}[n] \subseteq E_{\overline{k}}(\overline{k})$ for the module of $n$-torsion points of $E_{\overline{k}}(\overline{k})$ and
\[ \mathrm{Aut}_{\overline{k}}(E_{\overline{k}}) \subseteq \mathrm{Aut}_k(E_{\overline{k}}) \]
for the respective groups of $\epsilon_E$-preserving automorphisms of the $\overline{k}$-scheme $E_{\overline{k}}$ and the $k$-scheme $E_{\overline{k}}$. Then we have a natural exact sequence
\[ 1 \longrightarrow \mathrm{Aut}_{\overline{k}}(E_{\overline{k}}) \longrightarrow \mathrm{Aut}_k(E_{\overline{k}}) \longrightarrow G_k \]
where the image $G_E \subseteq G_k$ of the homomorphism $\mathrm{Aut}_k(E_{\overline{k}}) \to G_k$ is open, and a natural representation
\[ \rho_n : \mathrm{Aut}_k(E_{\overline{k}}) \to \mathrm{Aut}(E_{\overline{k}}[n]) \]
on the $n$-torsion points of $E_{\overline{k}}$. The finite extension $k_E$ of $k$ determined by $G_E$ is the minimal field of definition of $E_{\overline{k}}$, i.e., the field generated over $k$ by the $j$-invariant of $E_{\overline{k}}$. Finally, if $H \subseteq G_k$ is any closed subgroup, which corresponds to an extension $k_H$ of $k$, then the datum of a model of $E_{\overline{k}}$ over $k_H$ [i.e., descent data for $E_{\overline{k}}$ from $\overline{k}$ to $k_H$] is equivalent to the datum of a section of the homomorphism $\mathrm{Aut}_k(E_{\overline{k}}) \to G_k$ over $H$. In particular, the homomorphism $\mathrm{Aut}_k(E_{\overline{k}}) \to G_k$ admits a section over $G_E$.

(iii) In the situation of (ii), suppose further that $\mathrm{Aut}_{\overline{k}}(E_{\overline{k}}) = \{\pm 1\}$. Then the representation $\rho_2$ factors through $G_E$ and hence defines a natural representation $G_E \to \mathrm{Aut}(E_{\overline{k}}[2])$.

(iv) In the situation of (ii), suppose further that $l \geq 3$ is a prime number that is invertible in $k$, and that $E_{\overline{k}}$ descends to elliptic curves $E'_k$ and $E''_k$ over $k$, all of whose $l$-torsion points are rational over $k$. Then $E'_k$ is isomorphic to $E''_k$ over $k$.

(v) In the situation of (ii), suppose further that $k$ is a complete discrete valuation field with ring of integers $\mathcal{O}_k$, that $l \geq 3$ is a prime number that is invertible in $\mathcal{O}_k$, and that $E_{\overline{k}}$ descends to an elliptic curve $E_k$ over $k$, all of whose $l$-torsion points are rational over $k$. Then $E_k$ has semi-stable reduction over $\mathcal{O}_k$ [i.e., extends to a semi-abelian scheme over $\mathcal{O}_k$].
(vi) In the situation of (iii), suppose further that 2 is invertible in $k$, that $G_E = G_k$, and that the representation $G_E \to \mathrm{Aut}(E_{\overline{k}}[2])$ is trivial. Then $E_{\overline{k}}$ descends to an elliptic curve $E_k$ over $k$ which is defined by means of the Legendre form of the Weierstrass equation [cf., e.g., the statement of Corollary 2.2, below]. If, moreover, $k$ is a complete discrete valuation field with ring of integers $\mathcal{O}_k$ such that 2 is invertible in $\mathcal{O}_k$, then $E_k$ has semi-stable reduction over $\mathcal{O}_{k'}$ [i.e., extends to a semi-abelian scheme over $\mathcal{O}_{k'}$] for some finite extension $k' \subseteq \overline{k}$ of $k$ such that $[k' : k] \leq 2$; if $E_k$ has good reduction over $\mathcal{O}_{k'}$ [i.e., extends to an abelian scheme over $\mathcal{O}_{k'}$], then one may in fact take $k'$ to be $k$.

(vii) In the situation of (ii), suppose further that $k$ is a complete discrete valuation field with ring of integers $\mathcal{O}_k$, that $E_{\overline{k}}$ descends to an elliptic curve $E_k$ over $k$, and that $n$ is invertible in $\mathcal{O}_k$. If $E_k$ has good reduction over $\mathcal{O}_k$ [i.e., extends to an abelian scheme over $\mathcal{O}_k$], then the action of $G_k$ on $E_{\overline{k}}[n]$ is unramified. If $E_k$ has bad multiplicative reduction over $\mathcal{O}_k$ [i.e., extends to a non-proper semi-abelian scheme over $\mathcal{O}_k$], then the kernel of the action of $G_k$ on $E_{\overline{k}}[n]$ determines a tamely ramified extension of $k$ whose ramification index over $k$ divides $n$.

Proof. First, we consider assertion (i). Suppose that $\phi$ is not injective. Since $\mathrm{Aut}_k(A, \lambda)$ is well-known to be finite [cf., e.g., [Milne], Proposition 17.5, (a)], we thus conclude that there exists an $\alpha \in \mathrm{Ker}(\phi)$ of order $n \neq 1$. We may assume without loss of generality that $n$ is prime. Now we follow the argument of [Milne], Proposition 17.5, (b). Since $\alpha$ acts trivially on $A[l]$, it follows immediately that the endomorphism of $A$ given by $\alpha - \mathrm{id}_A$ [where $\mathrm{id}_A$ denotes the identity automorphism of $A$] may be written in the form $l \cdot \beta$, for $\beta$ an endomorphism of $A$ over $k$. Write $T_l(A)$ for the $l$-adic Tate module of $A$.
Since $\alpha^n = \mathrm{id}_A$, it follows that the eigenvalues of the action of $\alpha$ on $T_l(A)$ are $n$-th roots of unity. On the other hand, the eigenvalues of the action of $\beta$ on $T_l(A)$ are algebraic integers [cf. [Milne], Theorem 12.5]. We thus conclude that each eigenvalue $\zeta$ of the action of $\alpha$ on $T_l(A)$ is an $n$-th root of unity which, as an algebraic integer, is $\equiv 1$ (mod $l$) [where $l \geq 3$], hence $= 1$. Since $\alpha^n = \mathrm{id}_A$, it follows that $\alpha$ acts on $T_l(A)$ as a semi-simple matrix which is also unipotent, hence equal to the identity matrix. But this implies that $\alpha = \mathrm{id}_A$ [cf. [Milne], Theorem 12.5]. This contradiction completes the proof of assertion (i).

Next, we consider assertion (ii). Since $E_{\overline{k}}$ is proper over $\overline{k}$, it follows [by considering the space of global sections of the structure sheaf of $E_{\overline{k}}$] that any automorphism of the scheme $E_{\overline{k}}$ lies over an automorphism of $\overline{k}$. This implies the existence of a natural exact sequence and natural representation as in the statement of assertion (ii). The relationship between $k_E$ and the $j$-invariant of $E_{\overline{k}}$ follows immediately from the well-known theory of the $j$-invariant of an elliptic curve [cf., e.g., [Silv], Chapter III, Proposition 1.4, (b), (c)]. The final portion of assertion (ii) concerning models of $E_{\overline{k}}$ follows immediately from the definitions. This completes the proof of assertion (ii). Assertion (iii) follows immediately from the fact that $\{\pm 1\}$ acts trivially on $E_{\overline{k}}[2]$.

Next, we consider assertion (iv). First, let us observe that it follows immediately from the final portion of assertion (ii) that a model $E_k$ of $E_{\overline{k}}$ over $k$ all of whose $l$-torsion points are rational over $k$ corresponds to a closed subgroup $H \subseteq \mathrm{Aut}_k(E_{\overline{k}})$ that lies in the kernel of $\rho_l$ and, moreover, maps isomorphically to $G_k$. On the other hand, it follows from assertion (i) that the restriction of $\rho_l$ to $\mathrm{Aut}_{\overline{k}}(E_{\overline{k}}) \subseteq \mathrm{Aut}_k(E_{\overline{k}})$ is injective.
Thus, the closed subgroup $H \subseteq \mathrm{Aut}_k(E_{\overline{k}})$ is uniquely determined by the condition that it lie in the kernel of $\rho_l$ and, moreover, map isomorphically to $G_k$. This completes the proof of assertion (iv).

Next, we consider assertion (v). First, let us observe that, by considering $l$-level structures, we obtain a finite covering $S \to (\overline{\mathcal{M}}_{\mathrm{ell}})_{\mathbb{Z}[\frac{1}{l}]}$ which is étale over $(\mathcal{M}_{\mathrm{ell}})_{\mathbb{Z}[\frac{1}{l}]}$ and tamely ramified over the divisor at infinity. Then it follows from assertion (i) that the algebraic stack $S$ is in fact a scheme, which is, moreover, proper over $\mathbb{Z}[\frac{1}{l}]$. Thus, it follows from the valuative criterion for properness that any $k$-valued point of $S$ determined by $E_k$ [where we observe that such a point necessarily exists, in light of our assumption that the $l$-torsion points of $E_k$ are rational over $k$] extends to an $\mathcal{O}_k$-valued point of $S$, hence also of $\overline{\mathcal{M}}_{\mathrm{ell}}$, as desired. This completes the proof of assertion (v).

Next, we consider assertion (vi). Since $G_E = G_k$, it follows from assertion (ii) that $E_{\overline{k}}$ descends to an elliptic curve $E_k$ over $k$. Our assumption that the representation $G_k = G_E \to \mathrm{Aut}(E_{\overline{k}}[2])$ of assertion (iii) is trivial implies that the 2-torsion points of $E_k$ are rational over $k$. Thus, by considering suitable global sections of tensor powers of the line bundle on $E_k$ determined by the origin on which the automorphism "$-1$" of $E_k$ acts via multiplication by $\pm 1$ [cf., e.g., [Harts], Chapter IV, the proof of Proposition 4.6], one concludes immediately that a suitable [possibly trivial] twist $E'_k$ of $E_k$ over $k$ [i.e., such that $E'_k$ and $E_k$ are isomorphic over some quadratic extension $k'$ of $k$] may be defined by means of the Legendre form of the Weierstrass equation. Now suppose that $k$ is a complete discrete valuation field with ring of integers $\mathcal{O}_k$ such that 2 is invertible in $\mathcal{O}_k$, and that $E_k$ is defined by means of the Legendre form of the Weierstrass equation.
Then the fact that $E_k$ has semi-stable reduction over $\mathcal{O}_{k'}$ for some finite extension $k' \subseteq \overline{k}$ of $k$ such that $[k' : k] \leq 2$ follows from the explicit computations of the proof of [Silv], Chapter VII, Proposition 5.4, (c). These explicit computations also imply that if $E_k$ has good reduction over $\mathcal{O}_{k'}$, then one may in fact take $k'$ to be $k$. This completes the proof of assertion (vi).

Assertion (vii) follows immediately from [NerMod], §7.4, Theorem 5, in the case of good reduction and from [NerMod], §7.4, Theorem 6, in the case of bad multiplicative reduction.

We are now ready to apply the elementary computations discussed above to give more explicit log-volume estimates for Θ-pilot objects. We begin by recalling some notation and terminology from [GenEll], §1.

Definition 1.9. Let $F$ be a number field [i.e., a finite extension of the rational number field $\mathbb{Q}$], whose set of valuations we denote by $\mathbb{V}(F)$. Thus, $\mathbb{V}(F)$ decomposes as a disjoint union $\mathbb{V}(F) = \mathbb{V}(F)^{\mathrm{non}} \cup \mathbb{V}(F)^{\mathrm{arc}}$ of nonarchimedean and archimedean valuations. If $v \in \mathbb{V}(F)$, then we shall write $F_v$ for the completion of $F$ at $v$; if $v \in \mathbb{V}(F)^{\mathrm{non}}$, then we shall write $e_v$ for the ramification index of $F_v$ over $\mathbb{Q}_{p_v}$, $f_v$ for the residue field degree of $F_v$ over $\mathbb{Q}_{p_v}$, and $q_v$ for the cardinality of the residue field of $F_v$.

(i) An [$\mathbb{R}$-]arithmetic divisor $\mathfrak{a}$ on $F$ is defined to be a finite formal sum
\[ \sum_{v \in \mathbb{V}(F)} c_v \cdot v \]
where $c_v \in \mathbb{R}$, for all $v \in \mathbb{V}(F)$. Here, we shall refer to the set $\mathrm{Supp}(\mathfrak{a})$ of $v \in \mathbb{V}(F)$ such that $c_v \neq 0$ as the support of $\mathfrak{a}$; if all of the $c_v$ are $\geq 0$, then we shall say that the arithmetic divisor is effective. Thus, the [$\mathbb{R}$-]arithmetic divisors on $F$ naturally form a group $\mathrm{ADiv}_{\mathbb{R}}(F)$. The assignment
\[ \mathbb{V}(F)^{\mathrm{non}} \ni v \mapsto \log(q_v); \quad \mathbb{V}(F)^{\mathrm{arc}} \ni v \mapsto 1 \]
determines a homomorphism $\deg_F : \mathrm{ADiv}_{\mathbb{R}}(F) \to \mathbb{R}$ which we shall refer to as the degree map. If $\mathfrak{a} \in \mathrm{ADiv}_{\mathbb{R}}(F)$, then we shall refer to
\[ \underline{\deg}(\mathfrak{a}) \overset{\mathrm{def}}{=} \frac{1}{[F : \mathbb{Q}]} \cdot \deg_F(\mathfrak{a}) \]
as the normalized degree of $\mathfrak{a}$.
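Thus, for any finite extension $K$ of $F$, we have $\underline{\deg}(\mathfrak{a}|_K) = \underline{\deg}(\mathfrak{a})$, where we write $\underline{\deg}(\mathfrak{a}|_K)$ for the normalized degree of the pull-back $\mathfrak{a}|_K \in \mathrm{ADiv}_{\mathbb{R}}(K)$ [defined in the evident fashion] of $\mathfrak{a}$ to $K$.

(ii) Let $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}} \overset{\mathrm{def}}{=} \mathbb{V}(\mathbb{Q})$, $E \subseteq \mathbb{V}(F)$ a nonempty set of elements lying over $v_{\mathbb{Q}}$. If $\mathfrak{a} = \sum_{v \in \mathbb{V}(F)} c_v \cdot v \in \mathrm{ADiv}_{\mathbb{R}}(F)$, then we shall write
\[ \mathfrak{a}_E \overset{\mathrm{def}}{=} \sum_{v \in E} c_v \cdot v \in \mathrm{ADiv}_{\mathbb{R}}(F); \quad \underline{\deg}_E(\mathfrak{a}) \overset{\mathrm{def}}{=} \frac{\deg(\mathfrak{a}_E)}{\sum_{v \in E} [F_v : \mathbb{Q}_{v_{\mathbb{Q}}}]} \]
for the portion of $\mathfrak{a}$ supported in $E$ and the "normalized $E$-degree" of $\mathfrak{a}$, respectively. Thus, for any finite extension $K$ of $F$, we have $\underline{\deg}_{E|_K}(\mathfrak{a}|_K) = \underline{\deg}_E(\mathfrak{a})$, where we write $E|_K \subseteq \mathbb{V}(K)$ for the set of valuations lying over valuations $\in E$.

Theorem 1.10. (Log-volume Estimates for Θ-Pilot Objects) Fix a collection of initial Θ-data as in [IUTchI], Definition 3.1. Suppose that we are in the situation of [IUTchIII], Corollary 3.12, and that the elliptic curve $E_F$ has good reduction at every valuation $\in \mathbb{V}(F)^{\mathrm{good}} \cap \mathbb{V}(F)^{\mathrm{non}}$ that does not divide $2l$. In the notation of [IUTchI], Definition 3.1, let us write $d_{\mathrm{mod}} \overset{\mathrm{def}}{=} [F_{\mathrm{mod}} : \mathbb{Q}]$, $(1 \leq)\ e_{\mathrm{mod}}\ (\leq d_{\mathrm{mod}})$ for the maximal ramification index of $F_{\mathrm{mod}}$ [i.e., of valuations $\in \mathbb{V}^{\mathrm{non}}_{\mathrm{mod}}$] over $\mathbb{Q}$, $d^*_{\mathrm{mod}} \overset{\mathrm{def}}{=} 2^{12} \cdot 3^3 \cdot 5 \cdot d_{\mathrm{mod}}$, $e^*_{\mathrm{mod}} \overset{\mathrm{def}}{=} 2^{12} \cdot 3^3 \cdot 5 \cdot e_{\mathrm{mod}}$ $(\leq d^*_{\mathrm{mod}})$, and
\[ F_{\mathrm{mod}} \subseteq F_{\mathrm{tpd}} \overset{\mathrm{def}}{=} F_{\mathrm{mod}}(E_{F_{\mathrm{mod}}}[2]) \subseteq F \]
for the "tripodal" intermediate field obtained from $F_{\mathrm{mod}}$ by adjoining the fields of definition of the 2-torsion points of any model of $E_F \times_F \overline{F}$ over $F_{\mathrm{mod}}$ [cf. Proposition 1.8, (ii), (iii)]. Moreover, we assume that the $(3 \cdot 5)$-torsion points of $E_F$ are defined over $F$, and that
\[ F = F_{\mathrm{mod}}(\sqrt{-1}, E_{F_{\mathrm{mod}}}[2 \cdot 3 \cdot 5]) \overset{\mathrm{def}}{=} F_{\mathrm{tpd}}(\sqrt{-1}, E_{F_{\mathrm{tpd}}}[3 \cdot 5]) \]
i.e., that $F$ is obtained from $F_{\mathrm{tpd}}$ by adjoining $\sqrt{-1}$, together with the fields of definition of the $(3 \cdot 5)$-torsion points of a model $E_{F_{\mathrm{tpd}}}$ of the elliptic curve $E_F \times_F \overline{F}$ over $F_{\mathrm{tpd}}$ determined by the Legendre form of the Weierstrass equation [cf., e.g., the statement of Corollary 2.2, below; Proposition 1.8, (vi)].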
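[Thus, it follows from Proposition 1.8, (iv), that $E_F \cong E_{F_{\mathrm{tpd}}} \times_{F_{\mathrm{tpd}}} F$ over $F$, and from [IUTchI], Definition 3.1, (c), that $l \neq 5$.] If $F_{\mathrm{mod}} \subseteq F_{\square} \subseteq K$ is any intermediate extension which is Galois over $F_{\mathrm{mod}}$, then we shall write
\[ \mathfrak{d}^{F_\square}_{\mathrm{ADiv}} \in \mathrm{ADiv}_{\mathbb{R}}(F_\square) \]
for the effective arithmetic divisor determined by the different ideal of $F_\square$ over $\mathbb{Q}$,
\[ \mathfrak{q}^{F_\square}_{\mathrm{ADiv}} \in \mathrm{ADiv}_{\mathbb{R}}(F_\square) \]
for the effective arithmetic divisor determined by the $q$-parameters of the elliptic curve $E_F$ at the elements of $\mathbb{V}(F_\square)^{\mathrm{bad}} \overset{\mathrm{def}}{=} \mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}} \times_{\mathbb{V}_{\mathrm{mod}}} \mathbb{V}(F_\square)$ $(\neq \emptyset)$ [cf. [GenEll], Remark 3.3.1],
\[ \mathfrak{f}^{F_\square}_{\mathrm{ADiv}} \in \mathrm{ADiv}_{\mathbb{R}}(F_\square) \]
for the effective arithmetic divisor whose support coincides with $\mathrm{Supp}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}})$, but all of whose coefficients are equal to 1 [i.e., the conductor], and
\[ \log(\mathfrak{d}^{F_\square}_v) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_\square)_v}(\mathfrak{d}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{d}^{F_\square}_{v_{\mathbb{Q}}}) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_\square)_{v_{\mathbb{Q}}}}(\mathfrak{d}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{d}^{F_\square}) \overset{\mathrm{def}}{=} \underline{\deg}(\mathfrak{d}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0} \]
\[ \log(\mathfrak{q}_v) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_\square)_v}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{q}_{v_{\mathbb{Q}}}) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_\square)_{v_{\mathbb{Q}}}}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{q}) \overset{\mathrm{def}}{=} \underline{\deg}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0} \]
\[ \log(\mathfrak{f}^{F_\square}_v) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_\square)_v}(\mathfrak{f}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{f}^{F_\square}_{v_{\mathbb{Q}}}) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_\square)_{v_{\mathbb{Q}}}}(\mathfrak{f}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{f}^{F_\square}) \overset{\mathrm{def}}{=} \underline{\deg}(\mathfrak{f}^{F_\square}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0} \]
where $v \in \mathbb{V}_{\mathrm{mod}} = \mathbb{V}(F_{\mathrm{mod}})$, $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}} = \mathbb{V}(\mathbb{Q})$, $\mathbb{V}(F_\square)_v \overset{\mathrm{def}}{=} \mathbb{V}(F_\square) \times_{\mathbb{V}_{\mathrm{mod}}} \{v\}$, $\mathbb{V}(F_\square)_{v_{\mathbb{Q}}} \overset{\mathrm{def}}{=} \mathbb{V}(F_\square) \times_{\mathbb{V}_{\mathbb{Q}}} \{v_{\mathbb{Q}}\}$. Here, we observe that the various "$\log(\mathfrak{q}_{(-)})$'s" are independent of the choice of $F_\square$, and that the quantity "$|\log(\underline{\underline{q}})| \in \mathbb{R}_{>0}$" defined in [IUTchIII], Corollary 3.12, is equal to $\frac{1}{2l} \cdot \log(\mathfrak{q}) \in \mathbb{R}$ [cf. the definition of "$\underline{\underline{q}}_{\underline{v}}$" in [IUTchI], Example 3.2, (iv)].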
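Then one may take the constant "$C_\Theta \in \mathbb{R}$" of [IUTchIII], Corollary 3.12, to be
\[ \frac{l+1}{4 \cdot |\log(\underline{\underline{q}})|} \cdot \Big\{ \Big(1 + \frac{12 \cdot d_{\mathrm{mod}}}{l}\Big) \cdot \big(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\big) + 10 \cdot (e^*_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) - \frac{1}{6} \cdot \Big(1 - \frac{12}{l^2}\Big) \cdot \log(\mathfrak{q}) \Big\} - 1 \]
and hence, by applying the inequality "$C_\Theta \geq -1$" of [IUTchIII], Corollary 3.12, conclude that
\[ \frac{1}{6} \cdot \log(\mathfrak{q}) \leq \Big(1 + \frac{20 \cdot d_{\mathrm{mod}}}{l}\Big) \cdot \big(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\big) + 20 \cdot (e^*_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) \]
\[ \leq \Big(1 + \frac{20 \cdot d_{\mathrm{mod}}}{l}\Big) \cdot \big(\log(\mathfrak{d}^{F}) + \log(\mathfrak{f}^{F})\big) + 20 \cdot (e^*_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) \]
where $\eta_{\mathrm{prm}}$ is the positive real number of Proposition 1.6.

Proof. For ease of reference, we divide our discussion into steps, as follows.

(i) We begin by recalling the following elementary identities for $n \in \mathbb{N}_{\geq 1}$:

(E1) $\frac{1}{n} \sum_{m=1}^{n} m = \frac{1}{2}(n+1)$;

(E2) $\frac{1}{n} \sum_{m=1}^{n} m^2 = \frac{1}{6}(2n+1)(n+1)$.

Also, we recall the following elementary facts:

(E3) For $p$ a prime number, the cardinality $|GL_2(\mathbb{F}_p)|$ of $GL_2(\mathbb{F}_p)$ is given by $|GL_2(\mathbb{F}_p)| = p(p+1)(p-1)^2$.

(E4) For $p = 2, 3, 5$, the expression of (E3) may be computed as follows: $2(2+1)(2-1)^2 = 2 \cdot 3$; $3(3+1)(3-1)^2 = 3 \cdot 2^4$; $5(5+1)(5-1)^2 = 5 \cdot 2^5 \cdot 3$.

(E5) The degree of the extension $F_{\mathrm{mod}}(\sqrt{-1})/F_{\mathrm{mod}}$ is $\leq 2$.

(E6) We have: $0 \leq \log(2) \leq 1$, $1 \leq \log(3) \leq \log(\pi) \leq \log(5) \leq 2$.

(ii) Next, let us observe that the inequality
\[ \log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) \leq \log(\mathfrak{d}^{F}) + \log(\mathfrak{f}^{F}) \]
follows immediately from Proposition 1.3, (i), and the various definitions involved. On the other hand, the inequality
\[ \log(\mathfrak{d}^{F}) + \log(\mathfrak{f}^{F}) \leq \log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) + \log(2^{11} \cdot 3^3 \cdot 5^2) \leq \log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) + 21 \]
follows by applying Proposition 1.3, (i), at the primes that do not divide $2 \cdot 3 \cdot 5$ [where we recall that the extension $F/F_{\mathrm{tpd}}$ is tamely ramified over such primes; cf. Proposition 1.8, (vi), (vii)] and applying Proposition 1.3, (ii), together with (E3), (E4), (E5), (E6), and the fact that we have a natural outer inclusion $\mathrm{Gal}(F/F_{\mathrm{tpd}}) \hookrightarrow GL_2(\mathbb{F}_3) \times GL_2(\mathbb{F}_5) \times \mathbb{Z}/2\mathbb{Z}$, at the primes that divide $2 \cdot 3 \cdot 5$.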
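[The elementary facts (E1), (E2), (E3), (E4), (E6), together with the numerical estimate $\log(2^{11} \cdot 3^3 \cdot 5^2) \leq 21$, lend themselves to a direct machine check. The following Python sketch is an illustration only and is not part of the proof; the function names are hypothetical choices made for this example, and the range of $n$ tested for (E1), (E2) is arbitrary. It counts the invertible $2 \times 2$ matrices over $\mathbb{F}_p$ by brute force and compares the count with the expression of (E3).]

```python
import math
from fractions import Fraction
from itertools import product


def gl2_order_bruteforce(p):
    """Count the 2x2 matrices over F_p whose determinant is nonzero."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )


def gl2_order_formula(p):
    """The expression p(p+1)(p-1)^2 of (E3)."""
    return p * (p + 1) * (p - 1) ** 2


def mean_of_powers(n, k):
    """(1/n) * sum_{m=1}^{n} m^k, as an exact fraction."""
    return Fraction(sum(m ** k for m in range(1, n + 1)), n)


def check_elementary_facts(max_n=50):
    """Check (E1)-(E4), (E6) and the bound by 21; returns True on success."""
    # (E1), (E2), for n = 1, ..., max_n
    for n in range(1, max_n + 1):
        assert mean_of_powers(n, 1) == Fraction(n + 1, 2)
        assert mean_of_powers(n, 2) == Fraction((2 * n + 1) * (n + 1), 6)
    # (E3), (E4)
    expected = {2: 2 * 3, 3: 3 * 2 ** 4, 5: 5 * 2 ** 5 * 3}
    for p, value in expected.items():
        assert gl2_order_bruteforce(p) == gl2_order_formula(p) == value
    # (E6)
    assert 0 <= math.log(2) <= 1
    assert 1 <= math.log(3) <= math.log(math.pi) <= math.log(5) <= 2
    # Order of GL_2(F_3) x GL_2(F_5) x Z/2Z, and the bound by 21
    assert gl2_order_formula(3) * gl2_order_formula(5) * 2 == 2 ** 10 * 3 ** 2 * 5
    assert math.log(2 ** 11 * 3 ** 3 * 5 ** 2) <= 21
    return True


if __name__ == "__main__":
    print("elementary facts verified:", check_elementary_facts())
```

[The exponents 11, 3, 2 are each one more than the exponents of 2, 3, 5 in the order $2^{10} \cdot 3^2 \cdot 5$ of the ambient group, which appears to be consistent with the term "$n_i + 1/e_0$" of Proposition 1.3, (ii); the bound by 21 then follows from (E6), since $11 \cdot 1 + 3 \cdot 2 + 2 \cdot 2 = 21$.]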
In a similar vein, since the extension $K/F$ is tamely ramified at the primes that do not divide $l$, and we have a natural outer inclusion $\mathrm{Gal}(K/F) \hookrightarrow GL_2(\mathbb{F}_l)$, the inequality
\[ \log(\mathfrak{d}^{K}) \leq \log(\mathfrak{d}^{K}) + \log(\mathfrak{f}^{K}) \leq \log(\mathfrak{d}^{F}) + \log(\mathfrak{f}^{F}) + 2 \cdot \log(l) \leq \log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) + 2 \cdot \log(l) + 21 \]
follows immediately from Proposition 1.3, (i), (ii). Finally, for later reference, we observe that
\[ \Big(1 + \frac{4}{l}\Big) \cdot \log(\mathfrak{d}^{K}) \leq \Big(1 + \frac{4}{l}\Big) \cdot \big(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\big) + 2 \cdot \log(l) + 46 \]
where we apply the estimates $\frac{\log(l)}{l} \leq \frac{1}{2}$ and $1 + \frac{4}{l} \leq 2$, both of which may be regarded as consequences of the fact that $l \geq 5$ [cf. also (E6)].

(iii) If $F_{\mathrm{tpd}} \subseteq F_\square \subseteq K$ is any intermediate extension which is Galois over $F_{\mathrm{mod}}$, then we shall write
\[ \mathbb{V}(F_\square)^{\mathrm{dst}} \subseteq \mathbb{V}(F_\square)^{\mathrm{non}} \]
for the set of "distinguished" nonarchimedean valuations $v \in \mathbb{V}(F_\square)^{\mathrm{non}}$, i.e., $v$ that extend to a valuation $\in \mathbb{V}(K)^{\mathrm{non}}$ that ramifies over $\mathbb{Q}$. Now observe that it follows immediately from Proposition 1.8, (vi), (vii), together with our assumption on $\mathbb{V}(F)^{\mathrm{good}} \cap \mathbb{V}(F)^{\mathrm{non}}$, that

(D0) if $v \in \mathbb{V}(F_{\mathrm{tpd}})^{\mathrm{non}}$ does not divide $2 \cdot 3 \cdot 5 \cdot l$ and, moreover, is not contained in $\mathrm{Supp}(\mathfrak{q}^{F_{\mathrm{tpd}}}_{\mathrm{ADiv}})$, then the extension $K/F_{\mathrm{tpd}}$ is unramified over $v$.

Next, let us recall the well-known fact that the determinant of the Galois representation determined by the torsion points of an elliptic curve over a field of characteristic zero is the abelian Galois representation determined by the cyclotomic character. In particular, it follows [cf. the various definitions involved] that $K$ contains a primitive $4 \cdot 3 \cdot 5 \cdot l$-th root of unity, hence is ramified over $\mathbb{Q}$ at any valuation $\in \mathbb{V}(K)^{\mathrm{non}}$ that divides $2 \cdot 3 \cdot 5 \cdot l$. Thus, one verifies immediately [i.e., by applying (D0); cf. also [IUTchI], Definition 3.1, (c)] that the following conditions on a valuation $v \in \mathbb{V}(F_\square)^{\mathrm{non}}$ are equivalent:

(D1) $v \in \mathbb{V}(F_\square)^{\mathrm{dst}}$.

(D2) The valuation $v$ either divides $2 \cdot 3 \cdot 5 \cdot l$ or lies in $\mathrm{Supp}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}} + \mathfrak{d}^{F_\square}_{\mathrm{ADiv}})$.
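(D3) The image of $v$ in $\mathbb{V}(F_{\mathrm{tpd}})$ lies in $\mathbb{V}(F_{\mathrm{tpd}})^{\mathrm{dst}}$.

Let us write
\[ \mathbb{V}^{\mathrm{dst}}_{\mathrm{mod}} \subseteq \mathbb{V}^{\mathrm{non}}_{\mathrm{mod}}; \quad \mathbb{V}^{\mathrm{dst}}_{\mathbb{Q}} \subseteq \mathbb{V}^{\mathrm{non}}_{\mathbb{Q}} \]
for the respective images of $\mathbb{V}(F_{\mathrm{tpd}})^{\mathrm{dst}}$ in $\mathbb{V}_{\mathrm{mod}}$, $\mathbb{V}_{\mathbb{Q}}$ and, for $F_* \in \{F_{\mathrm{mod}}, \mathbb{Q}\}$ and $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$,
\[ \mathfrak{s}^{F_*}_{\mathrm{ADiv}} \overset{\mathrm{def}}{=} \sum_{v \in \mathbb{V}(F_*)^{\mathrm{dst}}} e_v \cdot v \in \mathrm{ADiv}_{\mathbb{R}}(F_*); \quad \log(\mathfrak{s}^{F_*}_{v_{\mathbb{Q}}}) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(F_*)_{v_{\mathbb{Q}}}}(\mathfrak{s}^{F_*}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{s}^{F_*}) \overset{\mathrm{def}}{=} \underline{\deg}(\mathfrak{s}^{F_*}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0} \]
\[ \mathfrak{s}^{\leq}_{\mathrm{ADiv}} \overset{\mathrm{def}}{=} \sum_{w_{\mathbb{Q}} \in \mathbb{V}(\mathbb{Q})^{\mathrm{dst}}} \frac{\iota_{w_{\mathbb{Q}}}}{\log(p_{w_{\mathbb{Q}}})} \cdot w_{\mathbb{Q}} \in \mathrm{ADiv}_{\mathbb{R}}(\mathbb{Q}); \quad \log(\mathfrak{s}^{\leq}_{v_{\mathbb{Q}}}) \overset{\mathrm{def}}{=} \underline{\deg}_{\mathbb{V}(\mathbb{Q})_{v_{\mathbb{Q}}}}(\mathfrak{s}^{\leq}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0}; \quad \log(\mathfrak{s}^{\leq}) \overset{\mathrm{def}}{=} \underline{\deg}(\mathfrak{s}^{\leq}_{\mathrm{ADiv}}) \in \mathbb{R}_{\geq 0} \]
where we write $\mathbb{V}(F_*)_{v_{\mathbb{Q}}} \overset{\mathrm{def}}{=} \mathbb{V}(F_*) \times_{\mathbb{V}_{\mathbb{Q}}} \{v_{\mathbb{Q}}\}$; we set $\iota_{w_{\mathbb{Q}}} \overset{\mathrm{def}}{=} 1$ if $p_{w_{\mathbb{Q}}} \leq e^*_{\mathrm{mod}} \cdot l$, $\iota_{w_{\mathbb{Q}}} \overset{\mathrm{def}}{=} 0$ if $p_{w_{\mathbb{Q}}} > e^*_{\mathrm{mod}} \cdot l$. Then one verifies immediately [again, by applying (D0); cf. also [IUTchI], Definition 3.1, (c)] that the following conditions on a valuation $v_{\mathbb{Q}} \in \mathbb{V}^{\mathrm{non}}_{\mathbb{Q}}$ are equivalent:

(D4) $v_{\mathbb{Q}} \in \mathbb{V}^{\mathrm{dst}}_{\mathbb{Q}}$.

(D5) The valuation $v_{\mathbb{Q}}$ ramifies in $K$.

(D6) Either $p_{v_{\mathbb{Q}}} \mid 2 \cdot 3 \cdot 5 \cdot l$ or $v_{\mathbb{Q}}$ lies in the image of $\mathrm{Supp}(\mathfrak{q}^{F_{\mathrm{tpd}}}_{\mathrm{ADiv}} + \mathfrak{d}^{F_{\mathrm{tpd}}}_{\mathrm{ADiv}})$.

(D7) Either $p_{v_{\mathbb{Q}}} \mid 2 \cdot 3 \cdot 5 \cdot l$ or $v_{\mathbb{Q}}$ lies in the image of $\mathrm{Supp}(\mathfrak{q}^{F}_{\mathrm{ADiv}} + \mathfrak{d}^{F}_{\mathrm{ADiv}})$.

Here, we observe in passing that, for $v \in \mathbb{V}(F_\square)$,

(R1) $\log(e_v) \leq \log(2^{11} \cdot 3^3 \cdot 5 \cdot e_{\mathrm{mod}} \cdot l^4)$ if $v$ divides $l$,

(R2) $\log(e_v) \leq \log(2^{11} \cdot 3^3 \cdot 5 \cdot e_{\mathrm{mod}} \cdot l)$ if $v$ divides $2 \cdot 3 \cdot 5$ or lies in $\mathrm{Supp}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}})$ [hence does not divide $l$],

(R3) $\log(e_v) \leq \log(2^{11} \cdot 3^3 \cdot 5 \cdot e_{\mathrm{mod}})$ if $v$ does not divide $2 \cdot 3 \cdot 5 \cdot l$ and, moreover, is not contained in $\mathrm{Supp}(\mathfrak{q}^{F_\square}_{\mathrm{ADiv}})$,

and hence that

(R4) if $e_v \geq p_v - 1 > p_v - 2$, then $p_v \leq 2^{12} \cdot 3^3 \cdot 5 \cdot e_{\mathrm{mod}} \cdot l = e^*_{\mathrm{mod}} \cdot l$, and $\log(e_v) \leq -3 + 4 \cdot \log(e^*_{\mathrm{mod}} \cdot l)$

[cf. (E3), (E4), (E5), (E6); (D0); Proposition 1.8, (v), (vii); [IUTchI], Definition 3.1, (c)]. Next, for later reference, we observe that the inequality
\[ \frac{1}{p_{v_{\mathbb{Q}}}} \cdot \log(\mathfrak{s}^{F_{\mathrm{mod}}}_{v_{\mathbb{Q}}}) \leq \frac{1}{p_{v_{\mathbb{Q}}}} \cdot \log(p_{v_{\mathbb{Q}}}) \]
holds for any $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$; in particular, when $p_{v_{\mathbb{Q}}} = l$ $(\geq 5)$, it holds that
\[ \frac{1}{p_{v_{\mathbb{Q}}}} \cdot \log(\mathfrak{s}^{F_{\mathrm{mod}}}_{v_{\mathbb{Q}}}) \leq \frac{1}{p_{v_{\mathbb{Q}}}} \cdot \log(p_{v_{\mathbb{Q}}}) \leq \frac{1}{2} \]
[cf. (E6)].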
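On the other hand, it follows immediately from Proposition 1.3, (i), by considering the various possibilities for elements $\in \mathrm{Supp}(\mathfrak{s}^{F_{\mathrm{mod}}}_{\mathrm{ADiv}})$, that
\[ \log(\mathfrak{s}^{F_{\mathrm{mod}}}_{v_{\mathbb{Q}}}) \;\le\; 2 \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}_{v_{\mathbb{Q}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}_{v_{\mathbb{Q}}})) \]
and hence that
\[ \frac{1}{p_{v_{\mathbb{Q}}}} \cdot \log(\mathfrak{s}^{F_{\mathrm{mod}}}_{v_{\mathbb{Q}}}) \;\le\; \frac{2}{p_{v_{\mathbb{Q}}}} \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}_{v_{\mathbb{Q}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}_{v_{\mathbb{Q}}})) \]
for any $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$ such that $p_{v_{\mathbb{Q}}} \notin \{2, 3, 5, l\}$. In a similar vein, we conclude that
\[ \log(\mathfrak{s}^{\mathbb{Q}}) \;\le\; 2 \cdot d_{\mathrm{mod}} \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + \log(2 \cdot 3 \cdot 5 \cdot l) \;\le\; 2 \cdot d_{\mathrm{mod}} \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + 5 + \log(l) \]
and hence that
\[ \frac{4}{l} \cdot \log(\mathfrak{s}^{\mathbb{Q}}) \;\le\; \frac{8 \cdot d_{\mathrm{mod}}}{l} \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + 6 \]
[cf. (E6); the fact that $l \ge 5$]. Combining this last inequality with the inequality of the final display of Step (ii) yields the inequality
\[ (1 + \tfrac{4}{l}) \cdot \log(\mathfrak{d}^{K}) + \tfrac{4}{l} \cdot \log(\mathfrak{s}^{\mathbb{Q}}) \;\le\; (1 + \tfrac{12 \cdot d_{\mathrm{mod}}}{l}) \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + 2 \cdot \log(l) + 52 \]
where we apply the estimate $d_{\mathrm{mod}} \ge 1$.

(iv) In order to estimate the constant “$C_{\Theta}$” of [IUTchIII], Corollary 3.12, we must, according to the various definitions given in the statement of [IUTchIII], Corollary 3.12, compute an upper bound for the procession-normalized mono-analytic log-volume of the holomorphic hull of the union of the possible images of a Θ-pilot object, relative to the relevant Kummer isomorphisms [cf. [IUTchIII], Theorem 3.11, (ii)], in the multiradial representation of [IUTchIII], Theorem 3.11, (i), which we regard as subject to the indeterminacies (Ind1), (Ind2), (Ind3) described in [IUTchIII], Theorem 3.11, (i), (ii). Thus, we proceed to estimate this log-volume at each $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$. Once one fixes $v_{\mathbb{Q}}$, this amounts to estimating the component of this log-volume in “$\mathcal{I}^{\mathbb{Q}}({}^{\mathbb{S}^{\pm}_{j+1};n,\circ}\mathcal{D}^{\vdash}_{v_{\mathbb{Q}}})$” [cf. the notation of [IUTchIII], Theorem 3.11, (i), (a)], for each $j \in \{1, \ldots, l^{\divideontimes}\}$, which we shall also regard as an element of $\mathbb{F}^{\divideontimes}_{l}$, and then computing the average, over $j \in \{1, \ldots, l^{\divideontimes}\}$, of these estimates. Here, we recall [cf.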
[IUTchI], Proposition 6.9, (i); [IUTchIII], Proposition 3.4, (ii)] that S ± j+1 = {0, 1, . . . , j}. Also, we recall ± from [IUTchIII], Proposition 3.2, that “I Q ( S j+1 ;n,◦ D v  Q )” is, by definition, a tensor product of j + 1 copies, indexed by the elements of S ± j+1 , of the direct sum of the Q- spans of the log-shells associated to each of the elements of V(F mod ) v Q [cf., especially, the second and third displays of [IUTchIII], Proposition 3.2]. In particular, for each collection {v i } i∈S ± j+1 of [not necessarily distinct!] elements of V(F mod ) v Q , we must estimate the com- ponent of the log-volume in question corresponding to the tensor product of the Q-spans of the log-shells associated to this collection {v i } i∈S ± and then compute j+1 the weighted average [cf. the discussion of Remark 1.7.1], over possible collections {v i } i∈S ± , of these estimates. j+1 (v) Let v Q V dst Q . Fix j, {v i } i∈S ± j+1 as in Step (iv). Write v i V V mod = V(F mod ) for the element corresponding to v i . We would like to apply Proposition 1.4, (iii), to the present situation, by taking · “I” to be S ± j+1 ; · “I I” to be the set of i I such that e v i > p v Q 2; · “k i to be K v i [so “R i will be the ring of integers O K vi of K v i ]; · “i to be j S ± j+1 ; · “λ” to be 0 if v j V good ; 2 · “λ” to be “ord(−)” of the element q j [cf. the definition of “q in [IUTchI], Example 3.2, (iv)] if v j V v j bad v . Thus, the inclusion “φ(p λ · (R I ) ) p λ−d I −a I  · log p (R I × )” of Proposition 1.4, (iii), implies that the result of multiplying “p λ−|I| · 2 −|I| · log p (R I × )” by a suitable nonpositive [cf. the inequalities concerning “d I +a I that constitute the final portion of Proposition 1.4, (iii)] integer power of p v Q contains the “union of possible images of a Θ-pilot object” discussed in Step (iv). That is to say, the indeterminacies (Ind1) and (Ind2) are taken into account by the arbitrary nature of the automorphism “φ” [cf. 
Proposition 1.2], while the indeterminacy (Ind3) is taken into account by the fact that we are considering upper bounds [cf. the discussion of Step (x) of the proof of [IUTchIII], Corollary 3.12], together with the fact that the above-mentioned integer power of p v Q is nonpositive, which implies that the module obtained by multiplying by this power of p v Q contains “p λ−|I| · 2 −|I| · log p (R I × )”. Thus, an upper bound on the component of the log-volume of the holomorphic hull under 28 SHINICHI MOCHIZUKI consideration may be obtained by computing an upper bound for the log-volume of the right-hand side of the inclusion “p λ−d I −a I  ·log p (R I × ) p λ−d I −a I −b I ·(R I ) of Proposition 1.4, (iii). Such an upper bound    λ + d I + 1 · log(p) + {3 + log(e i )}” i∈I is given in the second displayed inequality of Proposition 1.4, (iii). Here, we note that if e v i p v Q 2 for all i I, then this upper bound assumes the form   λ + d I + 1 · log(p)”. On the other hand, by (R4), if e v i > p v Q 2 for some i I, then it follows that p v Q e mod · l, and log(e v i ) −3 + 4 · log(e mod · l), so the upper bound in question may be taken to be   λ + d I + 1 · log(p) + 4(j + 1) · l mod def where we write l mod = log(e mod · l). Also, we note that, unlike the other terms that appear in these upper bounds, “λ” is asymmetric with respect to the choice of “i I” in S ± j+1 . Since we would like to compute weighted averages [cf. the discussion of Remark 1.7.1], we thus observe that, after symmetrizing with respect to the choice of “i I” in S ± j+1 , this upper bound may be written in the form “β e [cf. 
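the notation of Proposition 1.7] if, in the situation of Proposition 1.7, one takes

· “$E$” to be $\mathbb{V}(F_{\mathrm{mod}})_{v_{\mathbb{Q}}}$;
· “$n$” to be $j + 1$, so an element $\vec{e} \in E^{n}$ corresponds precisely to a collection $\{v_{i}\}_{i \in \mathbb{S}^{\pm}_{j+1}}$;
· “$\lambda_{e}$”, for an element $e \in E$ corresponding to $v \in \mathbb{V}(F_{\mathrm{mod}}) = \mathbb{V}_{\mathrm{mod}}$, to be $[(F_{\mathrm{mod}})_{v} : \mathbb{Q}_{v_{\mathbb{Q}}}] \in \mathbb{R}_{>0}$;
· “$\beta_{e}$”, for an element $e \in E$ corresponding to $v \in \mathbb{V}(F_{\mathrm{mod}}) = \mathbb{V}_{\mathrm{mod}}$, to be
\[ \log(\mathfrak{d}^{K}_{v}) - \frac{j^{2}}{2l(j+1)} \cdot \log(\mathfrak{q}_{v}) + \frac{1}{j+1} \cdot \log(p_{v_{\mathbb{Q}}}) + 4 \cdot \iota_{v_{\mathbb{Q}}} \cdot l^{*}_{\mathrm{mod}} \]
where we recall that $\iota_{v_{\mathbb{Q}}} \overset{\mathrm{def}}{=} 1$ if $p_{v_{\mathbb{Q}}} \le e^{*}_{\mathrm{mod}} \cdot l$, $\iota_{v_{\mathbb{Q}}} \overset{\mathrm{def}}{=} 0$ if $p_{v_{\mathbb{Q}}} > e^{*}_{\mathrm{mod}} \cdot l$.

Here, we note that it follows immediately from the first equality of the first display of Proposition 1.7 that, after passing to weighted averages, the operation of symmetrizing with respect to the choice of “$i^{\dagger} \in I$” in $\mathbb{S}^{\pm}_{j+1}$ does not affect the computation of the upper bound under consideration. Thus, by applying Proposition 1.7, we obtain that the resulting “weighted average upper bound” is given by
\[ (j + 1) \cdot \log(\mathfrak{d}^{K}_{v_{\mathbb{Q}}}) - \frac{j^{2}}{2l} \cdot \log(\mathfrak{q}_{v_{\mathbb{Q}}}) + \log(\mathfrak{s}^{\mathbb{Q}}_{v_{\mathbb{Q}}}) + 4(j + 1) \cdot l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}_{v_{\mathbb{Q}}}) \]
where we recall the notational conventions introduced in Step (iii). Thus, it remains to compute the average over $j \in \mathbb{F}^{\divideontimes}_{l}$. By averaging over $j \in \{1, \ldots, l^{\divideontimes} = \frac{l-1}{2}\}$ and applying (E1), (E2), we obtain the “procession-normalized upper bound”
\[ \frac{l^{\divideontimes}+3}{2} \cdot \log(\mathfrak{d}^{K}_{v_{\mathbb{Q}}}) - \frac{(2l^{\divideontimes}+1)(l^{\divideontimes}+1)}{12l} \cdot \log(\mathfrak{q}_{v_{\mathbb{Q}}}) + \log(\mathfrak{s}^{\mathbb{Q}}_{v_{\mathbb{Q}}}) + 2(l^{\divideontimes} + 3) \cdot l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}_{v_{\mathbb{Q}}}) \]
\[ = \frac{l+5}{4} \cdot \log(\mathfrak{d}^{K}_{v_{\mathbb{Q}}}) - \frac{l+1}{24} \cdot \log(\mathfrak{q}_{v_{\mathbb{Q}}}) + \log(\mathfrak{s}^{\mathbb{Q}}_{v_{\mathbb{Q}}}) + (l + 5) \cdot l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}_{v_{\mathbb{Q}}}) \]
\[ \le \frac{l+1}{4} \cdot \Big\{ (1 + \tfrac{4}{l}) \cdot \log(\mathfrak{d}^{K}_{v_{\mathbb{Q}}}) - \tfrac{1}{6} \cdot \log(\mathfrak{q}_{v_{\mathbb{Q}}}) + \tfrac{4}{l} \cdot \log(\mathfrak{s}^{\mathbb{Q}}_{v_{\mathbb{Q}}}) + \tfrac{20}{3} \cdot l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}_{v_{\mathbb{Q}}}) \Big\} \]
where, in the passage to the final displayed inequality, we apply the estimates $\frac{1}{l+1} \le \frac{1}{l}$ and $\frac{4(l+5)}{l+1} \le \frac{20}{3}$, both of which may be regarded as consequences of the fact that $l \ge 5$.

(vi) Next, let $v_{\mathbb{Q}} \in \mathbb{V}^{\mathrm{non}}_{\mathbb{Q}} \setminus \mathbb{V}^{\mathrm{dst}}_{\mathbb{Q}}$. Fix $j$, $\{v_{i}\}_{i \in \mathbb{S}^{\pm}_{j+1}}$ as in Step (iv). Write $\underline{v}_{i} \in \underline{\mathbb{V}} \overset{\sim}{\to} \mathbb{V}_{\mathrm{mod}} = \mathbb{V}(F_{\mathrm{mod}})$ for the element corresponding to $v_{i}$.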
We would like to apply Proposition 1.4, (iv), to the present situation, by taking

· “$I$” to be $\mathbb{S}^{\pm}_{j+1}$;
· “$k_{i}$” to be $K_{\underline{v}_{i}}$ [so “$R_{i}$” will be the ring of integers $\mathcal{O}_{K_{\underline{v}_{i}}}$ of $K_{\underline{v}_{i}}$].

Here, we note that our assumption that $v_{\mathbb{Q}} \in \mathbb{V}^{\mathrm{non}}_{\mathbb{Q}} \setminus \mathbb{V}^{\mathrm{dst}}_{\mathbb{Q}}$ implies that the hypotheses of Proposition 1.4, (iv), are satisfied. Thus, the inclusion “$\phi((R_{I})^{\sim}) \subseteq (R_{I})^{\sim}$” of Proposition 1.4, (iv), implies that the tensor product of log-shells under consideration contains the “union of possible images of a Θ-pilot object” discussed in Step (iv). That is to say, the indeterminacies (Ind1) and (Ind2) are taken into account by the arbitrary nature of the automorphism “$\phi$” [cf. Proposition 1.2], while the indeterminacy (Ind3) is taken into account by the fact that we are considering upper bounds [cf. the discussion of Step (x) of the proof of [IUTchIII], Corollary 3.12], together with the fact that the “container of possible images” is precisely equal to the tensor product of log-shells under consideration. Thus, an upper bound on the component of the log-volume under consideration may be obtained by computing an upper bound for the log-volume of the right-hand side “$(R_{I})^{\sim}$” of the above inclusion. Such an upper bound “$0$” is given in the final equality of Proposition 1.4, (iv). One may then compute a “weighted average upper bound” and then a “procession-normalized upper bound”, as was done in Step (v). The resulting “procession-normalized upper bound” is clearly equal to $0$.

(vii) Next, let $v_{\mathbb{Q}} \in \mathbb{V}^{\mathrm{arc}}_{\mathbb{Q}}$. Fix $j$, $\{v_{i}\}_{i \in \mathbb{S}^{\pm}_{j+1}}$ as in Step (iv). Write $\underline{v}_{i} \in \underline{\mathbb{V}} \overset{\sim}{\to} \mathbb{V}_{\mathrm{mod}} = \mathbb{V}(F_{\mathrm{mod}})$ for the element corresponding to $v_{i}$. We would like to apply Proposition 1.5, (iii), (iv), to the present situation, by taking

· “$I$” to be $\mathbb{S}^{\pm}_{j+1}$ [so $|I| = j + 1$];
· “$V$” to be $\mathbb{V}(F_{\mathrm{mod}})_{v_{\mathbb{Q}}}$;
· “$\mathbb{C}_{v}$” to be $K_{\underline{v}}$, where we write $\underline{v} \in \underline{\mathbb{V}} \overset{\sim}{\to} \mathbb{V}_{\mathrm{mod}}$ for the element determined by $v \in V$.
Then it follows from Proposition 1.5, (iii), (iv), that $\pi^{j+1} \cdot B_{I}$ serves as a container for the “union of possible images of a Θ-pilot object” discussed in Step (iv). That is to say, the indeterminacies (Ind1) and (Ind2) are taken into account by the fact that $B_{I} \subseteq M_{I}$ is preserved by arbitrary automorphisms of the type discussed in Proposition 1.5, (iii), while the indeterminacy (Ind3) is taken into account by the fact that we are considering upper bounds [cf. the discussion of Step (x) of the proof of [IUTchIII], Corollary 3.12], together with the fact that, by Proposition 1.5, (iv), together with our choice of the factor $\pi^{j+1}$, this “container of possible images” contains the elements of $M_{I}$ obtained by forming the tensor product of elements of the log-shells under consideration. Thus, an upper bound on the component of the log-volume under consideration may be obtained by computing an upper bound for the log-volume of this container. Such an upper bound
\[ (j + 1) \cdot \log(\pi) \]
follows immediately from the fact that [in order to ensure compatibility with arithmetic degrees of arithmetic line bundles, cf. [IUTchIII], Proposition 3.9, (iii), one is obliged to adopt normalizations which imply that] the log-volume of $B_{I}$ is equal to $0$. One may then compute a “weighted average upper bound” and then a “procession-normalized upper bound”, as was done in Step (v). The resulting “procession-normalized upper bound” is given by
\[ \frac{l+5}{4} \cdot \log(\pi) \;\le\; \frac{l+1}{4} \cdot 4 \]
[cf. (E1), (E6); the fact that $l \ge 5$].

(viii) Now we return to the discussion of Step (iv). In order to compute the desired upper bound for “$C_{\Theta}$”, it suffices to sum over $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$ the various local “procession-normalized upper bounds” obtained in Steps (v), (vi), (vii) for $v_{\mathbb{Q}} \in \mathbb{V}_{\mathbb{Q}}$.
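The summation just described can be sketched as follows. This intermediate display does not appear in the source; it is an illustration that assumes the global quantities $\log(\mathfrak{d}^{K})$, $\log(\mathfrak{q})$, $\log(\mathfrak{s}^{\mathbb{Q}})$, $\log(\mathfrak{s}^{\le})$ are the sums over $v_{\mathbb{Q}}$ of their local counterparts, and that these local counterparts vanish outside $\mathbb{V}^{\mathrm{dst}}_{\mathbb{Q}}$.

```latex
% Step (v): distinguished nonarchimedean v_Q;  Step (vi): contributes 0;
% Step (vii): the unique archimedean valuation of Q contributes at most ((l+1)/4) * 4.
\[
  \sum_{v_{\mathbb{Q}} \in \mathbb{V}^{\mathrm{dst}}_{\mathbb{Q}}}
  \frac{l+1}{4} \Big\{ (1 + \tfrac{4}{l}) \log(\mathfrak{d}^{K}_{v_{\mathbb{Q}}})
    - \tfrac{1}{6} \log(\mathfrak{q}_{v_{\mathbb{Q}}})
    + \tfrac{4}{l} \log(\mathfrak{s}^{\mathbb{Q}}_{v_{\mathbb{Q}}})
    + \tfrac{20}{3}\, l^{*}_{\mathrm{mod}} \log(\mathfrak{s}^{\le}_{v_{\mathbb{Q}}}) \Big\}
  \;+\; 0 \;+\; \frac{l+1}{4} \cdot 4
\]
\[
  = \frac{l+1}{4} \Big\{ (1 + \tfrac{4}{l}) \log(\mathfrak{d}^{K})
    - \tfrac{1}{6} \log(\mathfrak{q})
    + \tfrac{4}{l} \log(\mathfrak{s}^{\mathbb{Q}})
    + \tfrac{20}{3}\, l^{*}_{\mathrm{mod}} \log(\mathfrak{s}^{\le}) + 4 \Big\}
\]
```

Under these assumptions, the constant $52$ of the final display of Step (iii) and the archimedean contribution $4$ account for the constant $56$ in the bound that follows.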
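By applying the inequality of the final display of Step (iii), we thus obtain the following upper bound for “$C_{\Theta} \cdot |\log(\underline{\underline{q}})|$”, i.e., the product of “$C_{\Theta}$” and $\frac{1}{2l} \cdot \log(\mathfrak{q})$:
\[ \frac{l+1}{4} \cdot \Big\{ (1 + \tfrac{12 \cdot d_{\mathrm{mod}}}{l}) \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + 2 \cdot \log(l) + 56 - \tfrac{1}{6} \cdot (1 - \tfrac{12}{l^{2}}) \cdot \log(\mathfrak{q}) + \tfrac{20}{3} \cdot l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}) \Big\} - \frac{1}{2l} \cdot \log(\mathfrak{q}) \]
where we apply the estimate $\frac{l+1}{4} \cdot \frac{1}{6} \cdot \frac{12}{l^{2}} \ge \frac{1}{2l}$ [cf. the fact that $l \ge 1$].

Now let us recall the constant “$\eta_{\mathrm{prm}}$” of Proposition 1.6. By applying Proposition 1.6, we compute:
\[ l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}) \;\le\; \log(e^{*}_{\mathrm{mod}} \cdot l) \cdot \sum_{p \le e^{*}_{\mathrm{mod}} \cdot l} 1 \;\le\; \tfrac{4}{3} \cdot \log(e^{*}_{\mathrm{mod}} \cdot l) \cdot \frac{e^{*}_{\mathrm{mod}} \cdot l}{\log(e^{*}_{\mathrm{mod}} \cdot l)} \;=\; \tfrac{4}{3} \cdot e^{*}_{\mathrm{mod}} \cdot l \]
[where the sum ranges over the primes $p \le e^{*}_{\mathrm{mod}} \cdot l$] if $e^{*}_{\mathrm{mod}} \cdot l \ge \eta_{\mathrm{prm}}$;
\[ l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}) \;\le\; \log(e^{*}_{\mathrm{mod}} \cdot l) \cdot \sum_{p \le e^{*}_{\mathrm{mod}} \cdot l} 1 \;\le\; \tfrac{4}{3} \cdot \log(\eta_{\mathrm{prm}}) \cdot \frac{\eta_{\mathrm{prm}}}{\log(\eta_{\mathrm{prm}})} \;=\; \tfrac{4}{3} \cdot \eta_{\mathrm{prm}} \]
[where the sum ranges over the primes $p \le e^{*}_{\mathrm{mod}} \cdot l$] if $e^{*}_{\mathrm{mod}} \cdot l < \eta_{\mathrm{prm}}$. Thus, we conclude that
\[ l^{*}_{\mathrm{mod}} \cdot \log(\mathfrak{s}^{\le}) \;\le\; \tfrac{4}{3} \cdot (e^{*}_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) \]
[i.e., regardless of the size of $e^{*}_{\mathrm{mod}} \cdot l$]. Also, let us observe that
\[ \tfrac{1}{3} \cdot \tfrac{4}{3} \cdot (e^{*}_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) \;\ge\; \tfrac{1}{3} \cdot \tfrac{4}{3} \cdot e^{*}_{\mathrm{mod}} \cdot l \;\ge\; 2 \cdot 2 \cdot 2^{12} \cdot 3 \cdot 5 \cdot l \;\ge\; 2 \cdot \log(l) + 56 \]
where we apply the estimates $e_{\mathrm{mod}} \ge 1$, $2^{12} \cdot 3 \cdot 5 \ge 56$, $l \ge 5 \ge 1$, $l \ge \log(l)$ [cf. the fact that $l \ge 5$]. Thus, substituting back into our original upper bound for “$C_{\Theta} \cdot |\log(\underline{\underline{q}})|$”, we obtain the following upper bound for “$C_{\Theta}$”:
\[ \frac{l+1}{4 \cdot |\log(\underline{\underline{q}})|} \cdot \Big\{ (1 + \tfrac{12 \cdot d_{\mathrm{mod}}}{l}) \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + 10 \cdot (e^{*}_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) - \tfrac{1}{6} \cdot (1 - \tfrac{12}{l^{2}}) \cdot \log(\mathfrak{q}) \Big\} - 1 \]
where we apply the estimate $\frac{20+1}{3} \cdot \frac{4}{3} = \frac{7 \cdot 4}{3} \le 10$, i.e., as asserted in the statement of Theorem 1.10. The final portion of Theorem 1.10 follows immediately from [IUTchIII], Corollary 3.12, by applying the inequality of the first display of Step (ii), together with the estimates
\[ (1 - \tfrac{12}{l^{2}})^{-1} \le 2; \qquad (1 - \tfrac{12}{l^{2}})^{-1} \cdot (1 + \tfrac{12 \cdot d_{\mathrm{mod}}}{l}) \le 1 + \tfrac{20 \cdot d_{\mathrm{mod}}}{l} \]
[cf. the fact that $l \ge 7$, $d_{\mathrm{mod}} \ge 1$].

Remark 1.10.1.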
One of the main original motivations for the development of the theory discussed in the present series of papers was to create a framework, or geometry, within which a suitable analogue of the scheme-theoretic Hodge-Arakelov theory of [HASurI], [HASurII] could be realized in such a way that the obstructions to diophantine applications that arose in the scheme-theoretic formulation of [HASurI], [HASurII] [cf. the discussion of [HASurI], §1.5.1; [HASurII], Remark 3.7] could be avoided. From this point of view, it is of interest to observe that the computation of the “leading term” of the inequality of the final display of the statement of Theorem 1.10, i.e., of the term
\[ \frac{l^{\divideontimes}+3}{2} \cdot \log(\mathfrak{d}^{K}_{v_{\mathbb{Q}}}) - \frac{(2l^{\divideontimes}+1)(l^{\divideontimes}+1)}{12l} \cdot \log(\mathfrak{q}_{v_{\mathbb{Q}}}) \]
that occurs in the final display of Step (v) of the proof of Theorem 1.10, via the identities (E1), (E2) is essentially identical to the computation of the leading term that occurs in the proof of [HASurI], Theorem A [cf. the discussion following the statement of Theorem A in [HASurI], §1.1]. That is to say, in some sense, the computations performed in the proof of Theorem 1.10 were already essentially known to the author around the year 2000; the problem then was to construct an appropriate framework, or geometry, in which these computations could be performed! This sort of situation may be compared to the computations underlying the Weil Conjectures prior to the construction of a “Weil cohomology” in which those computations could be performed, or, alternatively, to various computations of invariants in topology or differential geometry that were motivated by computations in physics, again prior to the construction of a suitable mathematical framework in which those computations could be performed.

Remark 1.10.2.
The computation performed in the proof of Theorem 1.10 may be thought of as the computation of a sort of derivative in the $\mathbb{F}^{\divideontimes}_{l}$-direction, which, relative to the analogy between the theory of the present series of papers and the p-adic Teichmüller theory of [pOrd], [pTeich], corresponds to the derivative of the canonical Frobenius lifting [cf. the discussion of [IUTchIII], Remark 3.12.4, (iii)]. In this context, it is useful to recall the arithmetic Kodaira-Spencer morphism that occurs in scheme-theoretic Hodge-Arakelov theory [cf. [HASurII], §3]. In particular, in [HASurII], Corollary 3.6, it is shown that, when suitably formulated, a “certain portion” of this arithmetic Kodaira-Spencer morphism coincides with the usual geometric Kodaira-Spencer morphism. From the point of view of the action of $GL_2(\mathbb{F}_l)$ on the l-torsion points involved, this “certain portion” consists of the unipotent matrices
\[ \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix} \]
of $GL_2(\mathbb{F}_l)$. By contrast, the $\mathbb{F}^{\divideontimes}_{l}$-symmetries that occur in the present series of papers correspond to the toral matrices
\[ \begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix} \]
of $GL_2(\mathbb{F}_l)$ [cf. the discussion of [IUTchI], Example 4.3, (i)]. As we shall see in §2 below, in the present series of papers, we shall ultimately take $l$ to be “large”. When $l$ is “sufficiently large”, $GL_2(\mathbb{F}_l)$ may be thought of as a “good approximation” for $GL_2(\mathbb{Z})$ or $GL_2(\mathbb{R})$ [cf. the discussion of [IUTchI], Remark 6.12.3, (i), (iii)]. In the case of $GL_2(\mathbb{R})$, “toral subgroups” may be thought of as corresponding to the isotropy subgroups [isomorphic to $\mathbb{S}^{1}$] of points that arise from the action of $GL_2(\mathbb{R})$ on the upper half-plane, i.e., subgroups which may be thought of as a sort of geometric, group-theoretic representation of tangent vectors at a point.

Remark 1.10.3.
The “terms involving $l$” that occur in the inequality of the final display of Theorem 1.10 may be thought of as an inevitable consequence of the fundamental role played in the theory of the present series of papers by the l-torsion points of the elliptic curve under consideration. Here, we note that it is of crucial importance to work over the field of rationality of the l-torsion points [i.e., “$K$” as opposed to “$F$”] not only when considering the global portions of the various ΘNF- and $\Theta^{\pm\mathrm{ell}}$-Hodge theaters involved, but also when considering the local portions, i.e., the prime-strips, of these ΘNF- and $\Theta^{\pm\mathrm{ell}}$-Hodge theaters. That is to say, these local portions are necessary, for instance, in order to glue together the ΘNF- and $\Theta^{\pm\mathrm{ell}}$-Hodge theaters that appear so as to form a $\Theta^{\pm\mathrm{ell}}$NF-Hodge theater [cf. the discussion of [IUTchI], Remark 6.12.2]. In particular, to allow, within these local portions, any sort of “Galois indeterminacy” with respect to the l-torsion points [even, for instance, at $\underline{v} \in \underline{\mathbb{V}}^{\mathrm{good}} \cap \underline{\mathbb{V}}^{\mathrm{non}}$, which, at first glance, might appear irrelevant to the theory of Hodge-Arakelov-theoretic evaluation at l-torsion points developed in [IUTchII]] would have the effect of invalidating the various delicate manipulations involving l-torsion points discussed in [IUTchI], §4, §6 [cf., e.g., [IUTchI], Propositions 4.7, 6.5].

Remark 1.10.4. The various fluctuations in log-volume [i.e., whose computation is the subject of Theorem 1.10!] that arise from the multiradial representation of [IUTchIII], Theorem 3.11, (i), may be thought of as a sort of “inter-universal analytic torsion”. Indeed, in general, “analytic torsion” may be understood as a sort of measure, in “metrized” [e.g., log-volume!] terms, of the degree of deviation of the “holomorphic functions” [such as sections of a line bundle] on a variety [i.e., which depend, in an essential way, on the holomorphic moduli of the variety!]
from the “real analytic functions” [i.e., which are invariant with respect to deformations of the holomorphic moduli of the variety]. For instance:

(a) In “classical” Arakelov theory, analytic torsion typically arises as [the logarithm of] a sort of normalized determinant of the Laplacian acting on some space of real analytic [or $L^{2}$-] sections of a line bundle on a complex variety equipped with a real analytic Kähler metric [cf., e.g., [Arak], Chapters V, VI]. Here, we recall that in this sort of situation, the space of holomorphic sections of the line bundle is given by the kernel of the Laplacian; the definition of the Laplacian depends, in an essential way, on the Kähler metric, hence, in particular, on the holomorphic moduli of the variety under consideration [cf., e.g., the case of the Poincaré metric on a hyperbolic Riemann surface!].

(b) In the scheme-theoretic Hodge-Arakelov theory discussed in [HASurI], [HASurII], the main theorem consists of a sort of comparison isomorphism [cf. [HASurI], Theorem A] between a certain subspace of the space of global sections of the pull-back of an ample line bundle on an elliptic curve to the universal vectorial extension of the elliptic curve and the space of set-theoretic functions on the torsion points of the elliptic curve. That is to say, the former space of sections contains, in a natural way, the space of holomorphic sections of the ample line bundle on the elliptic curve, while the latter space of functions may be thought of as a sort of “discrete approximation” of the space of real analytic functions on the elliptic curve [cf. the discussion of [HASurI], §1.3.2, §1.3.4]. In this context, the “Gaussian poles” [cf.
the discussion of [HASurI], §1.1] arise as a measure of the discrepancy of integral structures between these two spaces in a neighborhood of the divisor at infinity of the moduli stack of elliptic curves, hence may be thought of as a sort of “analytic torsion at the divisor at infinity” [cf. the discussion of [HASurI], §1.2].

(c) In the case of the multiradial representation of [IUTchIII], Theorem 3.11, (i), the fluctuations of log-volume computed in Theorem 1.10 arise precisely as a result of the execution of a comparison of an “alien” arithmetic holomorphic structure to this multiradial representation, which is compatible with the permutation symmetries of the étale-picture, i.e., which is “invariant with respect to deformations of the arithmetic holomorphic moduli of the number field under consideration” in the sense that it makes sense simultaneously with respect to distinct arithmetic holomorphic structures [cf. [IUTchIII], Remark 3.11.1; [IUTchIII], Remark 3.12.3, (ii)]. Here, it is of interest to observe that the object of this comparison consists of the values of the theta function, i.e., in essence, a “holomorphic section of an ample line bundle”. In particular, the resulting fluctuations of log-volume may be thought of as a sort of “analytic torsion”. By analogy to the terminology “Gaussian poles” discussed in (b) above, it is natural to think of the terms involving the different $\mathfrak{d}^{K}_{(-)}$ that appear in the computation underlying Theorem 1.10 [cf., e.g., the final display of Step (v) of the proof of Theorem 1.10] as “differential poles” [cf. the discussion of Remarks 1.10.1, 1.10.2]. Finally, in the context of the normalized determinants that appear in (a), it is interesting to note the role played by the prime number theorem [i.e., in essence, the Riemann zeta function; cf. Proposition 1.6 and its proof] in the computation of “inter-universal analytic torsion” given in the proof of Theorem 1.10.

Remark 1.10.5.
The above remarks focused on the conceptual aspects of the theory surrounding Theorem 1.10. Before proceeding, however, we pause to discuss briefly certain aspects of Theorem 1.10 that are of interest from a computational point of view, i.e., in the spirit of conventional analytic number theory.

(i) First, we begin by observing that, unlike the inequalities that appear in the various results [cf. Corollaries 2.2, (ii); 2.3] obtained in §2 below, the inequalities obtained in Theorem 1.10 involve only essentially explicit constants and, moreover, do not require one to exclude some non-explicit finite set of “isomorphism classes of exceptional elliptic curves”. From this point of view, the inequalities obtained in Theorem 1.10 are suited to application to computations concerning various explicit diophantine equations, such as, for instance, the equations that appear in “Fermat’s Last Theorem”. Such explicit computations in the case of specific diophantine equations are, however, beyond the scope of the present paper.

(ii) One topic of interest in the context of computational aspects of Theorem 1.10 is the asymptotic behavior of the bound that appears in, say, the first inequality of the final display of Theorem 1.10. Let us assume, for simplicity, that $F_{\mathrm{tpd}} = \mathbb{Q}$ [so $d_{\mathrm{mod}} = 1$]. Also, to simplify the notation, let us write $\delta \overset{\mathrm{def}}{=} \log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) = \log(\mathfrak{f}^{F_{\mathrm{tpd}}})$. Then the bound under consideration assumes the form
\[ \delta + * \cdot \frac{\delta}{l} + * \cdot l + * \]
where, in the present discussion, the “$*$’s” are to be understood as denoting fixed positive real numbers. Thus, the leading term [cf. the discussion of Remark 1.10.1] is equal to $\delta$. The remaining terms give rise to the “$\epsilon$ terms” [and bounded discrepancy] of the inequalities of Corollaries 2.2, (ii); 2.3, obtained in §2 below.
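To make the shape of this bound concrete, the following sketch expands the right-hand side in the case $F_{\mathrm{tpd}} = \mathbb{Q}$. It assumes that the right-hand side of the first inequality of the final display of Theorem 1.10 is $(1 + \frac{20 \cdot d_{\mathrm{mod}}}{l}) \cdot (\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})) + 20 \cdot (e^{*}_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}})$, with $e^{*}_{\mathrm{mod}} = 2^{12} \cdot 3^{3} \cdot 5 \cdot e_{\mathrm{mod}}$; the letters $a$, $b$ are hypothetical names, introduced only here, for two of the “$*$’s”.

```latex
% With F_tpd = Q one has F_mod = Q, hence d_mod = e_mod = 1 and e^*_mod = 2^{12} * 3^3 * 5 = 552960.
\[
  (1 + \tfrac{20}{l}) \cdot \delta + 20 \cdot (e^{*}_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}})
  \;=\; \delta \;+\; 20 \cdot \frac{\delta}{l} \;+\; 20 \cdot e^{*}_{\mathrm{mod}} \cdot l \;+\; 20 \cdot \eta_{\mathrm{prm}}
\]
% Trade-off between the two l-dependent terms, by the arithmetic-geometric mean inequality,
% for fixed positive reals a, b and any real l > 0:
\[
  a \cdot \frac{\delta}{l} + b \cdot l \;\ge\; 2 \sqrt{ab} \cdot \delta^{1/2},
  \qquad \text{with equality if and only if } l = \sqrt{a \delta / b} .
\]
```

The second display treats $l$ as a positive real variable; since $l$ is in fact a prime number subject to further conditions, it only gives a lower bound on what any admissible choice of $l$ can achieve.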
Thus, if one ignores “bounded discrepancies”, it is of interest to consider the behavior of the “$\epsilon$ terms”
\[ * \cdot \frac{\delta}{l} + * \cdot l \]
as one allows the initial Θ-data under consideration to vary [i.e., subject to the condition “$F_{\mathrm{tpd}} = \mathbb{Q}$”]. In this context, one fundamental observation is the following: although $l$ is subject to various other conditions, no matter how “skillfully” one chooses $l$, the resulting “$\epsilon$ terms” are always
\[ \ge\; * \cdot \delta^{1/2} \]
[an estimate that may be obtained by thinking of $l$ as $\approx \delta^{\alpha}$, for some real number $\alpha$, and comparing $\delta^{\alpha}$ and $\delta^{1-\alpha}$]. This estimate is of particular interest in the context of various explicit examples constructed by Masser and others [cf. [Mss]; the discussion of [vFr], §2] in which explicit “abc sums” are constructed for which the quantity on the left-hand side of the inequality of Theorem 1.10 under consideration exceeds the order of
\[ \delta + * \cdot \frac{\delta^{1/2}}{\log(\delta)} \]
[cf. [vFr], Equation (6)]. In particular, the asymptotic estimates given by Theorem 1.10 are consistent with the known asymptotic behavior of these explicit abc sums. Indeed, the exponent $\frac{1}{2}$ that appears in the fundamental observation discussed above coincides precisely with the “expectation” expressed by van Frankenhuijsen in the final portion of the discussion of [vFr], §2! In the present paper, although we are unable to in fact achieve bounds on the “$\epsilon$ terms” of the order $* \cdot \delta^{1/2}$, we do succeed in obtaining bounds on the “$\epsilon$ terms” of the order
\[ * \cdot \delta^{1/2} \cdot \log(\delta) \]
albeit under the assumption that the abc sums under consideration are compactly bounded away from infinity at the prime 2, as well as at the archimedean prime [cf. Corollary 2.2, (ii); Remark 2.2.1 below for more details].

(iii) In the context of the discussion of (ii), it is of interest to observe that the “$* \cdot l$” portion of the “$\epsilon$ terms” that appear arises from the estimates given in Step (viii) of the proof of Theorem 1.10 for the quantity “$\log(\mathfrak{s}^{\le})$”.
From the point of view of the discussion of [vFr], §3, this quantity corresponds essentially to a “certain portion” of the quantity “$\omega(abc)$” associated to an abc sum. That is to say, whereas “$\omega(abc)$” denotes the total number of prime factors that occur in the product $abc$, the quantity “$\log(\mathfrak{s}^{\le})$” corresponds, roughly speaking, to the number of these prime factors that are $\le e^{*}_{\mathrm{mod}} \cdot l$. The appearance [i.e., in the proof of Theorem 1.10] of such a term which is closely related to “$\omega(abc)$” is of interest from the point of view of the discussion of [vFr], §3, partly since it is [not precisely identical to, but nonetheless] reminiscent of the various refinements of the ABC Conjecture proposed by Baker [i.e., which are the main topic of the discussion of [vFr], §3]. The appearance [i.e., in the proof of Theorem 1.10] of such a term which is closely related to “$\omega(abc)$” is also of interest from the point of view of the explicit abc sums discussed in (ii) that give rise to asymptotic behavior $\ge * \cdot \frac{\delta^{1/2}}{\log(\delta)}$. That is to say, according to the discussion of [vFr], §3, Remark 1, this sort of abc sum tends to give rise to a relatively large value for $\omega(abc)$, i.e., a state of affairs that is consistent with the crucial role played by the “$\epsilon$ term” related to $\omega(abc)$ in the computation of the lower bound “$\ge * \cdot \delta^{1/2}$” that appears in the fundamental observation of (ii). By contrast, the abc sums of the form “$2^{n} = p + qr$” [where $p$, $q$, and $r$ are prime numbers] considered in [vFr], §3, Remark 1, give rise to a relatively small value for $\omega(abc)$ [indeed, $\omega(abc) \le 4$], i.e., a situation that suggests relatively small/essentially negligible “$\epsilon$ terms” in the bound of Theorem 1.10 under consideration. Such essentially negligible “$\epsilon$ terms” are, however, consistent with the fact [cf.
[vFr], §3, Remark 1] that, for such abc sums, the left-hand side of the inequality of Theorem 1.10 under consideration is roughly $\approx \frac{1}{2}$ times the leading term of the bound on the right-hand side, hence, in particular, is amply bounded by the leading term on the right-hand side, without any “help” from the “$\epsilon$ terms”.

Remark 1.10.6.

(i) In the context of the discussion of Remark 1.10.5, it is important to remember that the bound on “$\frac{1}{6} \cdot \log(\mathfrak{q})$” given in Theorem 1.10 only concerns the q-parameters at the nonarchimedean valuations contained in $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}}$, all of which are necessarily of odd residue characteristic [cf. [IUTchI], Definition 3.1, (b)]. This observation is of relevance to the examples of abc sums constructed in [Mss] [cf. the discussion of Remark 1.10.5, (ii)], since it does not appear, at first glance, that there is any way to effectively control the contributions at the prime 2 in these examples, that is to say, in the notation of the Proposition of [Mss], to control the power of 2 that divides the integer “$c$” of the Proposition of [Mss], or, alternatively, in the notation of the proof of this Proposition on [Mss], p. 22, to control the power of 2 that divides the difference “$x_{i} - x_{i-1}$”. On the other hand, it was pointed out to the author by A. Venkatesh that in fact it is not difficult to modify the construction of these examples of abc sums given in [Mss] so as to obtain similar asymptotic estimates to those obtained in [Mss] [cf. the discussion of Remark 1.10.5, (ii)], even without taking into account the contributions at the prime 2.

(ii) In the context of the discussion of (i), it is of interest to recall why nonarchimedean primes of even residue characteristic where the elliptic curve under consideration has bad multiplicative reduction are excluded from $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}}$ in the theory of the present series of papers.
In a word, the reason that the theory encounters difficulties at primes over 2 is that it depends, in a quite essential way, on the theory of the étale theta function developed in [EtTh], which fails at primes over 2 [cf. the assumption that “$p$ is odd” in [EtTh], Theorem 1.10, (iii); [EtTh], Definition 2.5; [EtTh], Corollary 2.18]. From the point of view of the theory of [IUTchI], [IUTchII], and [IUTchIII] [cf., especially, the theory of [IUTchII], §1, §2: [IUTchII], Corollary 1.12; [IUTchII], Corollary 2.4, (ii), (iii); [IUTchII], Corollary 2.6], one of the key consequences of the theory of [EtTh] is the simultaneous multiradiality of the algorithms that give rise to

(1) constant multiple rigidity and
(2) cyclotomic rigidity.

At a more concrete level, (1) is obtained by evaluating the usual series for the theta function [cf. [EtTh], Proposition 1.4] at the 2-torsion point in the “irreducible component labeled zero”. One computes easily that the resulting “special value” is a unit for odd $p$, but is equal to a [nonzero] non-unit when $p = 2$. In particular, since (1) is established by dividing the series of [EtTh], Proposition 1.4 [i.e., the usual series for the theta function], by this special value, it follows that

(a) the “integral structure” on the theta function determined by this special value

coincides with

(b) the “integral structure” on the theta function determined by the natural integral structure on the pole at the origin

for odd $p$ [cf. [EtTh], Theorem 1.10, (iii)], but not when $p = 2$. That is to say, when $p = 2$, a nontrivial denominator arises. Here, we recall that it is crucial to evaluate at 2-torsion points, i.e., as opposed to, say, more general points in the irreducible component labeled zero, for reasons discussed in [IUTchII], Remark 2.5.1, (ii) [cf. also the discussion of [IUTchII], Remark 1.12.2, (i), (ii), (iii), (iv)].
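The computation of the special value alluded to above can be sketched as follows. The sketch assumes that the series of [EtTh], Proposition 1.4, has the form displayed in the first line, and that the 2-torsion point in question corresponds to $\ddot{U} = \sqrt{-1}$; both are assumptions made here for illustration and should be checked against [EtTh].

```latex
% Assumed form of the theta series (q denotes the q-parameter, topologically nilpotent):
\[
  \ddot{\Theta}(\ddot{U}) \;=\; q^{-\frac{1}{8}} \cdot \sum_{n \in \mathbb{Z}} (-1)^{n} \cdot q^{\frac{1}{2}(n + \frac{1}{2})^{2}} \cdot \ddot{U}^{2n+1}
\]
% Use q^{-1/8} q^{(n+1/2)^2/2} = q^{n(n+1)/2} and (-1)^n (\sqrt{-1})^{2n+1} = \sqrt{-1};
% the indices n and -n-1 give the same exponent n(n+1)/2, so the terms pair up.
\[
  \ddot{\Theta}(\sqrt{-1}) \;=\; \sqrt{-1} \cdot \sum_{n \in \mathbb{Z}} q^{\frac{n(n+1)}{2}}
  \;=\; 2 \sqrt{-1} \cdot \big( 1 + q + q^{3} + q^{6} + \cdots \big)
\]
```

Under these assumptions, the factor $1 + q + q^{3} + \cdots$ is a unit, so the special value is a unit precisely when $2$ is a unit, i.e., precisely when $p$ is odd.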
This nontrivial denominator is fundamentally incompatible with the multiradiality of the algorithms of (1), (2) in that it is incompatible with the fundamental splitting, or “decoupling”, into “purely radial” [i.e., roughly speaking, “value group”] and “purely coric” [i.e., roughly speaking, “unit”] components discussed in [IUTchII], Remarks 1.11.4, (i); 1.12.2, (vi) [cf. also the discussion of [IUTchII], Remark 1.11.5]. That is to say, on the one hand, the multiradiality of (1) may only be established if the possible values at the evaluation points in the irreducible component labeled zero are known, a priori, to be units, i.e., if one works relative to the integral structure (a) [cf. the discussion of [IUTchII], Remark 1.12.2, (i), (ii), (iii), (iv)]. On the other hand, if one tries to work simultaneously with the integral structure (b), hence with the nontrivial denominator discussed above, then the multiradiality of (2) is violated. Here, we recall that the integral structure (b), which is referred to as the “canonical integral structure” in [EtTh], Proposition 1.4, (iii); [EtTh], Theorem 1.10, (iii), is in some sense the “integral structure of common sense”.

(iii) It is not entirely clear to the author at the time of writing to what extent the integral structure (b) is necessary in order to carry out the theory developed in the present series of papers. Indeed, [EtTh], as well as the present series of papers, was written in a way that [unlike the discussion of (ii)!] “takes for granted” the fact that the two integral structures (a), (b) discussed above coincide for odd $p$, i.e., in a way which identifies these two integral structures and hence does not specify, at various key points in the discussion, whether one is in fact working with integral structure (a) or with integral structure (b).
On the other hand, if it is indeed the case that not only the integral structure (a), but also the integral structure (b) plays an essential role in the present series of papers, then it follows [cf. the discussion of (ii)!] that the theory of the present series of papers is fundamentally incompatible with the inclusion in $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}}$ of nonarchimedean primes of even residue characteristic where the elliptic curve under consideration has bad multiplicative reduction.

(iv) In the context of the discussion of (ii), (iii), it is perhaps useful to recall that the classical theory of theta functions also tends to [depending on your point of view!] “break down” or “assume a completely different form” at the prime 2. For instance, this phenomenon can be seen throughout Mumford’s theory of algebraic theta functions, which may be thought of as a sort of predecessor to the scheme-theoretic Hodge-Arakelov theory of [HASurI], [HASurII], which, in turn, may be thought of as a sort of predecessor to the theory of the present series of papers. In a similar vein, it is of interest to recall that the prime 2 is also excluded in the p-adic Teichmüller theory of [pOrd], [pTeich]. This is done in order to avoid the complications that occur in the theory of the Lie algebra $\mathfrak{sl}_2$ over fields of characteristic 2.

Remark 1.10.7.

(i) Since $e_{\mathrm{mod}} \leq d_{\mathrm{mod}}$, one may replace “$e_{\mathrm{mod}}$” by “$d_{\mathrm{mod}}$” in the final two displays of the statement of Theorem 1.10.

(ii) By contrast, at least if one adheres to the framework of the theory of the present series of papers, it is not possible to replace “$d_{\mathrm{mod}}$” by “$e_{\mathrm{mod}}$” in the final two displays of the statement of Theorem 1.10. The fundamental reason for this is that, in the construction of the multiradial representation of [IUTchIII], Theorem 3.11, (i), it is necessary to consider tensor products of copies, labeled by $j \in \mathbb{F}_l^{\divideontimes}$, of $F_{\mathrm{mod}}$ over $\mathbb{Q}$ [cf. [IUTchIII], Proposition 3.3!].
That is to say, it is fundamentally impossible [i.e., relative to the framework of the theory of the present series of papers] to identify the $F_{\mathrm{mod}}$-linear structures for distinct labels j, since the various tensor packets that appear in the multiradial representation must be constructed in such a way as to depend only on the additive structure [i.e., not the module structure over some sort of ring such as $F_{\mathrm{mod}}$!] of the [mono-analytic!] log-shells involved. Working with tensor powers of copies of $F_{\mathrm{mod}}$ over $\mathbb{Q}$ means that there is no way to avoid, when one localizes at a prime number p, working with tensor products between localizations of $F_{\mathrm{mod}}$ at distinct primes of $F_{\mathrm{mod}}$ that divide p. Moreover, whenever even one of these primes of $F_{\mathrm{mod}}$ lies under a prime of K that ramifies over $\mathbb{Q}$ [cf. condition (D5) of Step (iii) of the proof of Theorem 1.10], the computation of Step (v) of the proof of Theorem 1.10 necessarily gives rise to a “log(p)” term [i.e., a term that appears in “$\log(\mathfrak{s}^{\mathbb{Q}})$”] that arises from “rounding up” non-integral powers of p [i.e., as in the inclusions of Proposition 1.4, (iii)], since only integral powers of p make sense in the multiradial representation. That is to say, whereas integral powers of p only require the use of the additive structure of the [mono-analytic!] log-shells involved, non-integral powers only make sense if one is equipped with the module structure over some sort of ring such as $F_{\mathrm{mod}}$!

Section 2: Diophantine Inequalities

In the present §2, we combine Theorem 1.10 with the theory of [GenEll] to give a proof of the ABC Conjecture, or, equivalently, Vojta’s Conjecture for hyperbolic curves [cf. Corollary 2.3 below]. We begin by reviewing some well-known estimates.

Proposition 2.1. (Well-known Estimates)

(i) (Linearization of Logarithms) We have $\log(x) \leq x$ for all $(\mathbb{R} \ni)\ x \geq 1$.
(ii) (The Prime Number Theorem) There exists a real number $\xi_{\mathrm{prm}} \geq 5$ such that

$$\tfrac{2}{3} \cdot x \;\leq\; \theta(x) \overset{\mathrm{def}}{=} \sum_{p \leq x} \log(p) \;\leq\; \tfrac{4}{3} \cdot x$$

[where the sum ranges over the prime numbers p such that $p \leq x$] for all $(\mathbb{R} \ni)\ x \geq \xi_{\mathrm{prm}}$. In particular, if A is a finite set of prime numbers, and we write

$$\theta_A \overset{\mathrm{def}}{=} \sum_{p \in A} \log(p)$$

[where we take the sum to be 0 if $A = \emptyset$], then there exists a prime number $p \notin A$ such that $p \leq 2(\theta_A + \xi_{\mathrm{prm}})$.

Proof. Assertion (i) is well-known and entirely elementary. Assertion (ii) is a well-known consequence of the Prime Number Theorem [cf., e.g., [Edw], p. 76; [GenEll], Lemma 4.1; [GenEll], Remark 4.1.1].

Let $\overline{\mathbb{Q}}$ be an algebraic closure of $\mathbb{Q}$. In the following discussion, we shall apply the notation and terminology of [GenEll]. Let X be a smooth, proper, geometrically connected curve over a number field; $D \subseteq X$ a reduced divisor; $U_X \overset{\mathrm{def}}{=} X \setminus D$; d a positive integer. Write $\omega_X$ for the canonical sheaf on X. Suppose that $U_X$ is a hyperbolic curve, i.e., that the degree of the line bundle $\omega_X(D)$ is positive. Then we recall the following notation:

· $U_X(\overline{\mathbb{Q}})^{\leq d} \subseteq U_X(\overline{\mathbb{Q}})$ denotes the subset of $\overline{\mathbb{Q}}$-rational points defined over a finite extension field of $\mathbb{Q}$ of degree $\leq d$ [cf. [GenEll], Example 1.3, (i)].

· $\text{log-diff}_X$ denotes the [normalized] log-different function on $U_X(\overline{\mathbb{Q}})$ [cf. [GenEll], Definition 1.5, (iii)].

· $\text{log-cond}_D$ denotes the [normalized] log-conductor function on $U_X(\overline{\mathbb{Q}})$ [cf. [GenEll], Definition 1.5, (iv)].

· $\mathrm{ht}_{\omega_X(D)}$ denotes the [normalized] height function on $U_X(\overline{\mathbb{Q}})$ associated to $\omega_X(D)$, which is well-defined up to a “bounded discrepancy” [cf. [GenEll], Proposition 1.4, (iii)].

In order to apply the theory of the present series of papers, it is necessary to construct suitable initial Θ-data, as follows.

Corollary 2.2. (Construction of Suitable Initial Θ-Data) Suppose that $X = \mathbb{P}^1_{\mathbb{Q}}$ is the projective line over $\mathbb{Q}$, and that $D \subseteq X$ is the divisor consisting of the three points “0”, “1”, and “∞”.
We shall regard X as the “λ-line” [i.e., we shall regard the standard coordinate on $X = \mathbb{P}^1_{\mathbb{Q}}$ as the “λ” in the Legendre form “$y^2 = x(x-1)(x-\lambda)$” of the Weierstrass equation defining an elliptic curve] and hence as being equipped with a natural classifying morphism $U_X \to (\mathcal{M}_{\mathrm{ell}})_{\mathbb{Q}}$ [cf. the discussion preceding Proposition 1.8]. Let $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$ be a compactly bounded subset [i.e., regarded as a subset of $X(\overline{\mathbb{Q}})$; cf. Remark 2.3.1, (vi), below; [GenEll], Example 1.3, (ii)] whose support contains the nonarchimedean prime “2”. Suppose further that $\mathcal{K}_V$ satisfies the following condition:

$(\ast_{j\text{-inv}})$ If $v \in \mathbb{V}(\mathbb{Q})$ denotes the nonarchimedean prime “2”, then the image of the subset $\mathcal{K}_v \subseteq U_X(\overline{\mathbb{Q}}_v)$ associated to $\mathcal{K}_V$ [cf. the notational conventions of [GenEll], Example 1.3, (ii)] via the j-invariant $U_X \to (\mathcal{M}_{\mathrm{ell}})_{\mathbb{Q}} \to \mathbb{A}^1_{\mathbb{Q}}$ is a bounded subset of $\mathbb{A}^1_{\mathbb{Q}}(\overline{\mathbb{Q}}_v) = \overline{\mathbb{Q}}_v$, i.e., is contained in a subset of the form $2^{N_{j\text{-inv}}} \cdot \mathcal{O}_{\overline{\mathbb{Q}}_v} \subseteq \overline{\mathbb{Q}}_v$, where $N_{j\text{-inv}} \in \mathbb{Z}$, and $\mathcal{O}_{\overline{\mathbb{Q}}_v} \subseteq \overline{\mathbb{Q}}_v$ denotes the ring of integers.

Then:

(i) Write “$\log(\mathfrak{q}^{\forall}_{(-)})$” (respectively, “$\log(\mathfrak{q}^{\nmid 2}_{(-)})$”) for the $\mathbb{R}$-valued function on $\mathcal{M}_{\mathrm{ell}}(\overline{\mathbb{Q}})$, hence also on $U_X(\overline{\mathbb{Q}})$, obtained by forming the normalized degree “deg(−)” of the effective arithmetic divisor determined by the q-parameters of an elliptic curve over a number field at arbitrary nonarchimedean primes (respectively, at the nonarchimedean primes that do not divide 2) [cf. the invariant “$\log(\mathfrak{q})$” associated, in the statement of Theorem 1.10, to the elliptic curve $E_F$]. Also, we shall write $\mathrm{ht}_{\infty}$ for the [normalized] height function on $U_X(\overline{\mathbb{Q}})$ [a function which is well-defined up to a “bounded discrepancy”; cf. the discussion preceding [GenEll], Proposition 3.4] determined by the pull-back to X of the divisor at infinity of the natural compactification $(\overline{\mathcal{M}}_{\mathrm{ell}})_{\mathbb{Q}}$ of $(\mathcal{M}_{\mathrm{ell}})_{\mathbb{Q}}$. Then we have an equality of “bounded discrepancy classes” [cf.
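[GenEll], Definition 1.2, (ii), as well as Remark 2.3.1, (ii), below]

$$\tfrac{1}{6} \cdot \log(\mathfrak{q}^{\nmid 2}_{(-)}) \;\approx\; \tfrac{1}{6} \cdot \log(\mathfrak{q}^{\forall}_{(-)}) \;\approx\; \tfrac{1}{6} \cdot \mathrm{ht}_{\infty} \;\approx\; \mathrm{ht}_{\omega_X(D)}$$

of functions on $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$.

(ii) There exist

· a positive real number $H_{\mathrm{unif}}$ which is independent of $\mathcal{K}_V$ and

· positive real numbers $C_{\mathcal{K}}$ and $H_{\mathcal{K}}$ which depend only on the choice of the compactly bounded subset $\mathcal{K}_V$

such that the following property is satisfied: Let d be a positive integer, $\epsilon_d$ a positive real number $\leq 1$. Set $\delta \overset{\mathrm{def}}{=} 2^{12} \cdot 3^3 \cdot 5 \cdot d$. Then there exists a finite subset $\mathrm{Exc}_d \subseteq U_X(\overline{\mathbb{Q}})^{\leq d}$ which depends only on $\mathcal{K}_V$, d, and $\epsilon_d$, contains all points corresponding to elliptic curves that admit automorphisms of order > 2, and satisfies the following property:

The function “$\log(\mathfrak{q}^{\forall}_{(-)})$” of (i) is $\leq H_{\mathrm{unif}} \cdot \epsilon_d^{-3} \cdot d^{4+\epsilon_d} + H_{\mathcal{K}}$ on $\mathrm{Exc}_d$. Let $E_F$ be an elliptic curve over a number field $F \subseteq \overline{\mathbb{Q}}$ that determines a $\overline{\mathbb{Q}}$-valued point of $(\mathcal{M}_{\mathrm{ell}})_{\mathbb{Q}}$ which lifts [not necessarily uniquely!] to a point $x_E \in U_X(F) \cap U_X(\overline{\mathbb{Q}})^{\leq d}$ such that $x_E \in \mathcal{K}_V$, $x_E \notin \mathrm{Exc}_d$. Write $F_{\mathrm{mod}}$ for the minimal field of definition of the corresponding point $\in \mathcal{M}_{\mathrm{ell}}(\overline{\mathbb{Q}})$ and

$$F_{\mathrm{mod}} \;\subseteq\; F_{\mathrm{tpd}} \overset{\mathrm{def}}{=} F_{\mathrm{mod}}(E_{F_{\mathrm{mod}}}[2]) \;\subseteq\; F$$

for the “tripodal” intermediate field obtained from $F_{\mathrm{mod}}$ by adjoining the fields of definition of the 2-torsion points of any model of $E_F \times_F \overline{\mathbb{Q}}$ over $F_{\mathrm{mod}}$ [cf. Proposition 1.8, (ii), (iii)]. Moreover, we assume that the (3·5)-torsion points of $E_F$ are defined over F, and that

$$F = F_{\mathrm{mod}}(\sqrt{-1},\, E_{F_{\mathrm{mod}}}[2 \cdot 3 \cdot 5]) \overset{\mathrm{def}}{=} F_{\mathrm{tpd}}(\sqrt{-1},\, E_{F_{\mathrm{tpd}}}[3 \cdot 5])$$

[i.e., that F is obtained from $F_{\mathrm{tpd}}$ by adjoining $\sqrt{-1}$, together with the fields of definition of the (3·5)-torsion points of a model $E_{F_{\mathrm{tpd}}}$ of the elliptic curve $E_F \times_F \overline{\mathbb{Q}}$ over $F_{\mathrm{tpd}}$ determined by the Legendre form of the Weierstrass equation discussed above; cf. Proposition 1.8, (vi)].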
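[Thus, it follows from Proposition 1.8, (iv), that $E_F = E_{F_{\mathrm{tpd}}} \times_{F_{\mathrm{tpd}}} F$ over F, so $x_E \in U_X(F_{\mathrm{tpd}}) \subseteq U_X(F)$; it follows from Proposition 1.8, (v), that $E_F$ has stable reduction at every element of $\mathbb{V}(F)^{\mathrm{non}}$.] Write $\log(\mathfrak{q}^{\forall})$ (respectively, $\log(\mathfrak{q}^{\nmid 2})$) for the result of applying the function “$\log(\mathfrak{q}^{\forall}_{(-)})$” (respectively, “$\log(\mathfrak{q}^{\nmid 2}_{(-)})$”) of (i) to $x_E$. Then $E_F$ and $F_{\mathrm{mod}}$ arise as the “$E_F$” and “$F_{\mathrm{mod}}$” for a collection of initial Θ-data as in Theorem 1.10 that, in the notation of Theorem 1.10, satisfies the following conditions:

(C1) $(\log(\mathfrak{q}^{\forall}))^{1/2} \leq l \leq 10\delta \cdot (\log(\mathfrak{q}^{\forall}))^{1/2} \cdot \log(2\delta \cdot \log(\mathfrak{q}^{\forall}))$;

(C2) we have inequalities

$$\tfrac{1}{6} \cdot \log(\mathfrak{q}) \;\leq\; \tfrac{1}{6} \cdot \log(\mathfrak{q}^{\nmid 2}) \;\leq\; \tfrac{1}{6} \cdot \log(\mathfrak{q}^{\forall}) \;\leq\; (1 + \epsilon_E) \cdot (\text{log-diff}_X(x_E) + \text{log-cond}_D(x_E)) + C_{\mathcal{K}}$$

where we write

$$\epsilon_E \overset{\mathrm{def}}{=} (60\delta)^2 \cdot \frac{\log(2\delta \cdot \log(\mathfrak{q}^{\forall}))}{(\log(\mathfrak{q}^{\forall}))^{1/2}}$$

[i.e., so $\epsilon_E$ depends on the integer d, as well as on the elliptic curve $E_F$!], and we observe, relative to the notation of Theorem 1.10, that [it follows tautologically from the definitions that] we have an equality $\text{log-diff}_X(x_E) = \log(\mathfrak{d}^{F_{\mathrm{tpd}}})$, as well as inequalities $\log(\mathfrak{f}^{F_{\mathrm{tpd}}}) \leq \text{log-cond}_D(x_E) \leq \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) + \log(2l)$.

(iii) The positive real number $H_{\mathrm{unif}}$ of (ii) [which is independent of $\mathcal{K}_V$!] may be chosen in such a way that the following property is satisfied: Let d be a positive integer, $\epsilon_d$ and $\epsilon$ positive real numbers $\leq 1$. Then there exists a finite subset $\mathrm{Exc}_{\epsilon,d} \subseteq U_X(\overline{\mathbb{Q}})^{\leq d}$ which depends only on $\mathcal{K}_V$, $\epsilon$, d, and $\epsilon_d$ such that, in the notation of (ii), the function “$\log(\mathfrak{q}^{\forall}_{(-)})$” of (i) is $\leq H_{\mathrm{unif}} \cdot \epsilon^{-3} \cdot \epsilon_d^{-3} \cdot d^{4+\epsilon_d} + H_{\mathcal{K}}$ on $\mathrm{Exc}_{\epsilon,d}$, and, moreover, the invariant $\epsilon_E$ associated to an elliptic curve $E_F$ as in (ii) [i.e., that satisfies certain conditions which depend on $\mathcal{K}_V$ and d] satisfies the inequality $\epsilon_E \leq \epsilon$ whenever the point $x_E \in U_X(F)$ satisfies the condition $x_E \notin \mathrm{Exc}_{\epsilon,d}$.

Proof. First, we consider assertion (i).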
We begin by observing that, in light of the condition $(\ast_{j\text{-inv}})$ that was imposed on $\mathcal{K}_V$, it follows immediately from the various definitions involved that

$$\log(\mathfrak{q}^{\nmid 2}_{(-)}) \;\approx\; \log(\mathfrak{q}^{\forall}_{(-)})$$

[where we observe that the function “$\log(\mathfrak{q}^{\forall}_{(-)})$” may be identified with the function “$\deg_{\infty}$” of the discussion preceding [GenEll], Proposition 3.4] on $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$. In a similar vein, since the support of $\mathcal{K}_V$ contains the unique archimedean prime of $\mathbb{Q}$, it follows immediately from the various definitions involved [cf. also Remark 2.3.1, (vi), below] that

$$\log(\mathfrak{q}^{\forall}_{(-)}) \;\approx\; \mathrm{ht}_{\infty}$$

on $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$ [cf. the argument of the final paragraph of the proof of [GenEll], Lemma 3.7]. Thus, we conclude that $\log(\mathfrak{q}^{\nmid 2}_{(-)}) \approx \log(\mathfrak{q}^{\forall}_{(-)}) \approx \mathrm{ht}_{\infty}$ on $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$. Finally, since [as is well-known] the pull-back to X of the divisor at infinity of the natural compactification $(\overline{\mathcal{M}}_{\mathrm{ell}})_{\mathbb{Q}}$ of $(\mathcal{M}_{\mathrm{ell}})_{\mathbb{Q}}$ is of degree 6, while the line bundle $\omega_X(D)$ is of degree 1, the equality of BD-classes $\tfrac{1}{6} \cdot \mathrm{ht}_{\infty} \approx \mathrm{ht}_{\omega_X(D)}$ on $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$ follows immediately from [GenEll], Proposition 1.4, (i), (iii). This completes the proof of assertion (i).

Next, we consider assertion (ii). First, let us recall that if the once-punctured elliptic curve associated to $E_F$ fails to admit an F-core, then there are only four possibilities for the j-invariant of $E_F$ [cf. [CanLift], Proposition 2.7]. Thus, if we take the set $\mathrm{Exc}_d$ to be the [finite!] collection of points corresponding to these four j-invariants, then we may assume that the once-punctured elliptic curve associated to $E_F$ admits an F-core, hence, in particular, does not have any automorphisms of order > 2 over $\overline{\mathbb{Q}}$. In the discussion to follow, it will be necessary to enlarge the finite set $\mathrm{Exc}_d$ several times, always in a fashion that depends only on $\mathcal{K}_V$, d, and $\epsilon_d$ [i.e., but not on $x_E$!]
and in such a way that the function “$\log(\mathfrak{q}^{\forall}_{(-)})$” of (i) is $\leq H_{\mathrm{unif}} \cdot \epsilon_d^{-3} \cdot d^{4+\epsilon_d} + H_{\mathcal{K}}$ on $\mathrm{Exc}_d$ for some positive real number $H_{\mathrm{unif}}$ that is independent of $\mathcal{K}_V$ and some positive real number $H_{\mathcal{K}}$ that depends only on $\mathcal{K}_V$ [i.e., but not on d or $\epsilon_d$!].

Next, let us write

$$h \overset{\mathrm{def}}{=} \log(\mathfrak{q}^{\forall}) = \frac{1}{[F:\mathbb{Q}]} \cdot \sum_{v \in \mathbb{V}(F)^{\mathrm{non}}} h_v \cdot f_v \cdot \log(p_v)$$

[that is to say, $h_v = 0$ for those v at which $E_F$ has good reduction; $h_v \in \mathbb{N}_{\geq 1}$ is the local height of $E_F$ [cf. [GenEll], Definition 3.3] for those v at which $E_F$ has bad multiplicative reduction]. Now it follows [by assertion (i); [GenEll], Proposition 1.4, (iv)] that the inequality $h^{1/2} < \xi_{\mathrm{prm}}$ [cf. the notation of Proposition 2.1, (ii)] implies that there is only a finite number of possibilities for the j-invariant of $E_F$. Thus, by possibly enlarging the finite set $\mathrm{Exc}_d$ [in a fashion that depends only on $\mathcal{K}_V$, d, and $\epsilon_d$ and in such a way that $h \leq H_{\mathrm{unif}}$ on $\mathrm{Exc}_d$ for some positive real number $H_{\mathrm{unif}}$ that is independent of $\mathcal{K}_V$], we may assume without loss of generality that the inequality $h^{1/2} \geq \xi_{\mathrm{prm}} \geq 5$ holds. Thus, since $[F:\mathbb{Q}] \leq \delta$ [cf. the properties (E3), (E4), (E5) in the proof of Theorem 1.10], it follows that

$$\delta \cdot h^{1/2} \;\geq\; [F:\mathbb{Q}] \cdot h^{1/2} \;=\; \sum_{v} h^{-1/2} \cdot h_v \cdot f_v \cdot \log(p_v) \;\geq\; \sum_{v} h^{-1/2} \cdot h_v \cdot \log(p_v) \;\geq\; \sum_{h_v \geq h^{1/2}} h^{-1/2} \cdot h_v \cdot \log(p_v) \;\geq\; \sum_{h_v \geq h^{1/2}} \log(p_v)$$

and

$$2\delta \cdot h^{1/2} \cdot \log(2\delta \cdot h) \;\geq\; 2 \cdot [F:\mathbb{Q}] \cdot h^{1/2} \cdot \log(2 \cdot [F:\mathbb{Q}] \cdot h) \;\geq\; \sum_{h_v \neq 0} 2 \cdot h^{-1/2} \cdot \log(2 \cdot h_v \cdot f_v \cdot \log(p_v)) \cdot h_v \cdot f_v \cdot \log(p_v) \;\geq\; \sum_{h_v \neq 0} h^{-1/2} \cdot \log(h_v) \cdot h_v \;\geq\; \sum_{h_v \geq h^{1/2}} h^{-1/2} \cdot \log(h_v) \cdot h_v \;\geq\; \sum_{h_v \geq h^{1/2}} \log(h_v)$$

[where the sums are all over $v \in \mathbb{V}(F)^{\mathrm{non}}$, possibly subject to various conditions, as indicated, and we apply the elementary estimate $2 \cdot \log(p_v) \geq 2 \cdot \log(2) = \log(4) \geq 1$; cf. the property (E6) in the proof of Theorem 1.10]. Thus, in summary, we conclude from the estimates made above that if we take A to be the [finite!]
set of prime numbers p such that p either

(S1) is $\leq h^{1/2}$,

(S2) divides a nonzero $h_v$ for some $v \in \mathbb{V}(F)^{\mathrm{non}}$, or

(S3) is equal to $p_v$ for some $v \in \mathbb{V}(F)^{\mathrm{non}}$ for which $h_v \geq h^{1/2}$,

then it follows from Proposition 2.1, (ii), together with our assumption that $h^{1/2} \geq \xi_{\mathrm{prm}}$, that, in the notation of Proposition 2.1, (ii),

$$\theta_A \;\leq\; 2 \cdot h^{1/2} + \delta \cdot h^{1/2} + 2\delta \cdot h^{1/2} \cdot \log(2\delta \cdot h) \;\leq\; 4\delta \cdot h^{1/2} \cdot \log(2\delta \cdot h) \;\leq\; -\xi_{\mathrm{prm}} + 5\delta \cdot h^{1/2} \cdot \log(2\delta \cdot h)$$

[where we apply the estimates $\delta \geq 2$ and $\log(2\delta \cdot h) \geq \log(4) \geq 1$; cf. the property (E6) in the proof of Theorem 1.10]. In particular, it follows from Proposition 2.1, (i), (ii), together with our assumption that $h^{1/2} \geq 5 \geq 1$, that there exists a prime number l such that

(P1) $(5 \leq)\ h^{1/2} \leq l \leq 10\delta \cdot h^{1/2} \cdot \log(2\delta \cdot h)\ (\leq 20 \cdot \delta^2 \cdot h^2)$ [cf. the condition (C1) in the statement of Corollary 2.2];

(P2) l does not divide any nonzero $h_v$ for $v \in \mathbb{V}(F)^{\mathrm{non}}$;

(P3) if $l = p_v$ for some $v \in \mathbb{V}(F)^{\mathrm{non}}$, then $h_v < h^{1/2}$.

Next, let us observe that, again by possibly enlarging the finite set $\mathrm{Exc}_d$ [in a fashion that depends only on $\mathcal{K}_V$, d, and $\epsilon_d$ and in such a way that $h \leq H_{\mathcal{K}}$ on $\mathrm{Exc}_d$ for some positive real number $H_{\mathcal{K}}$ that depends only on $\mathcal{K}_V$], we may assume without loss of generality that, in the terminology of [GenEll], Lemma 3.5,

(P4) $E_F$ does not admit an l-cyclic subgroup scheme.

Indeed, the existence of an l-cyclic subgroup scheme of $E_F$ would imply that

$$\frac{l-2}{24} \cdot \log(\mathfrak{q}^{\forall}) \;\leq\; 2 \cdot \log(l) + T_{\mathcal{K}}$$

[where we apply assertion (i), (P2), the displayed inequality of [GenEll], Lemma 3.5, and the final inequality of the display of [GenEll], Proposition 3.4; we take the “ε” of [GenEll], Lemma 3.5, to be 1; we write $T_{\mathcal{K}}$ for the positive real number [which depends only on the choice of the compactly bounded subset $\mathcal{K}_V$] that results from the various “bounded discrepancies” implicit in these inequalities]. Since $l \geq 5$ [cf. (P1)], it follows that $1 \leq 2 \cdot \log(l) \leq 48 \cdot \frac{l-2}{24}$ [cf.
the property (E6) in the proof of Theorem 1.10], and hence that the inequality of the preceding display implies that $\log(\mathfrak{q}^{\forall})$ is bounded. On the other hand, [by assertion (i); [GenEll], Proposition 1.4, (iv)] this implies that there is only a finite number of possibilities for the j-invariant of $E_F$. This completes the proof of the above observation.

Next, let us note that it follows immediately from (P1), together with Proposition 2.1, (i), that

$$h^{1/2} \cdot \log(l) \;\leq\; h^{1/2} \cdot \log(20 \cdot \delta^2 \cdot h^2) \;\leq\; 2 \cdot h^{1/2} \cdot \log(5\delta \cdot h) \;\leq\; 8 \cdot h^{1/2} \cdot \log(2 \cdot \delta^{1/4} \cdot h^{1/4}) \;\leq\; 8 \cdot h^{1/2} \cdot 2 \cdot \delta^{1/4} \cdot h^{1/4} \;=\; 16 \cdot \delta^{1/4} \cdot h^{3/4}$$

[where we apply the estimates $20 \leq 5^2$ and $5 \leq 2^4$]. In particular, we observe that, again by possibly enlarging the finite set $\mathrm{Exc}_d$ [in a fashion that depends only on $\mathcal{K}_V$, d, and $\epsilon_d$ and in such a way that $h \leq H_{\mathrm{unif}} \cdot d + H_{\mathcal{K}}$ on $\mathrm{Exc}_d$ for some positive real number $H_{\mathrm{unif}}$ that is independent of $\mathcal{K}_V$ and some positive real number $H_{\mathcal{K}}$ that depends only on $\mathcal{K}_V$], we may assume without loss of generality that

(P5) if we write $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}}$ for the set of nonarchimedean valuations $\in \mathbb{V}_{\mathrm{mod}} \overset{\mathrm{def}}{=} \mathbb{V}(F_{\mathrm{mod}})$ that do not divide 2l and at which $E_F$ has bad multiplicative reduction, then $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}} \neq \emptyset$.

Indeed, if $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}} = \emptyset$, then it follows, in light of the definition of h, from (P3), assertion (i), and the computation performed above, that

$$h \;\approx\; \log(\mathfrak{q}^{\nmid 2}) \;\leq\; h^{1/2} \cdot \log(l) \;\leq\; 16 \cdot \delta^{1/4} \cdot h^{3/4}$$

[an inequality which implies that $h^{1/4}$, hence h itself, is bounded]. On the other hand, [by assertion (i); [GenEll], Proposition 1.4, (iv)] this implies that there is only a finite number of possibilities for the j-invariant of $E_F$. This completes the proof of the above observation. This property (P5) implies that

(P6) the image of the outer homomorphism $\mathrm{Gal}(\overline{\mathbb{Q}}/F) \to GL_2(\mathbb{F}_l)$ determined by the l-torsion points of $E_F$ contains the subgroup $SL_2(\mathbb{F}_l) \subseteq GL_2(\mathbb{F}_l)$.
Indeed, since, by (P5), $E_F$ has bad multiplicative reduction at some valuation $\in \mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}} \neq \emptyset$, (P6) follows formally from (P2), (P4), and [GenEll], Lemma 3.1, (iii) [cf. the proof of the final portion of [GenEll], Theorem 3.8].

Now it follows formally from (P1), (P2), (P5), and (P6) that, if one takes “$\overline{F}$” to be $\overline{\mathbb{Q}}$, “F” to be the number field F of the above discussion, “$X_F$” to be the once-punctured elliptic curve associated to $E_F$, “l” to be the prime number l of the above discussion, and “$\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}}$” to be the set $\mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}}$ of (P5), then there exist data “$\underline{C}_K$”, “$\underline{\mathbb{V}}$”, and “$\underline{\epsilon}$” such that all of the conditions of [IUTchI], Definition 3.1, (a), (b), (c), (d), (e), (f), are satisfied, and, moreover, that

(P7) the resulting initial Θ-data $(\overline{F}/F,\ X_F,\ l,\ \underline{C}_K,\ \underline{\mathbb{V}},\ \mathbb{V}^{\mathrm{bad}}_{\mathrm{mod}},\ \underline{\epsilon})$ satisfies the various conditions in the statement of Theorem 1.10.

Here, we note in passing that the crucial existence of data “$\underline{\mathbb{V}}$” and “$\underline{\epsilon}$” satisfying the requisite conditions follows, in essence, as a consequence of the fact [i.e., (P6)] that the Galois action on l-torsion points contains the full special linear group $SL_2(\mathbb{F}_l)$.

In light of (P7), we may apply Theorem 1.10 [cf. also Remark 1.10.7, (i)] to conclude that

$$\tfrac{1}{6} \cdot \log(\mathfrak{q}) \;\leq\; \Bigl(1 + \frac{20 \cdot d_{\mathrm{mod}}}{l}\Bigr) \cdot \bigl(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\bigr) + 20 \cdot (d^{\ast}_{\mathrm{mod}} \cdot l + \eta_{\mathrm{prm}}) \;\leq\; (1 + \delta \cdot h^{-1/2}) \cdot \bigl(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\bigr) + 200 \cdot \delta^2 \cdot h^{1/2} \cdot \log(2\delta \cdot h) + 20\eta_{\mathrm{prm}}$$

[where we apply (P1), as well as the estimates $20 \cdot d_{\mathrm{mod}} \leq d^{\ast}_{\mathrm{mod}} \leq \delta$]. Next, let us observe that it follows from (P3), together with the computation of the discussion preceding (P5), that

$$\tfrac{1}{6} \cdot \log(\mathfrak{q}^{\nmid 2}) - \tfrac{1}{6} \cdot \log(\mathfrak{q}) \;\leq\; \tfrac{1}{6} \cdot h^{1/2} \cdot \log(l) \;\leq\; \tfrac{1}{3} \cdot h^{1/2} \cdot \log(5\delta \cdot h) \;\leq\; h^{1/2} \cdot \log(2\delta \cdot h)$$

[where we apply the estimates $1 \leq h$ and $5 \leq 2^3$].
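Thus, since, by assertion (i), the difference $\tfrac{1}{6} \cdot \log(\mathfrak{q}^{\forall}) - \tfrac{1}{6} \cdot \log(\mathfrak{q}^{\nmid 2})$ is bounded by some positive real number $B_{\mathcal{K}}$ [which depends only on the choice of the compactly bounded subset $\mathcal{K}_V$], we conclude that

$$\tfrac{1}{6} \cdot h = \tfrac{1}{6} \cdot \log(\mathfrak{q}^{\forall}) \;\leq\; (1 + \delta \cdot h^{-1/2}) \cdot \bigl(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\bigr) + (15\delta)^2 \cdot h^{1/2} \cdot \log(2\delta \cdot h) + \tfrac{1}{2} \cdot C_{\mathcal{K}} \;\leq\; (1 + \delta \cdot h^{-1/2}) \cdot \bigl(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\bigr) + \tfrac{1}{6} \cdot h \cdot \tfrac{2}{5} \cdot (60\delta)^2 \cdot h^{-1/2} \cdot \log(2\delta \cdot h) + \tfrac{1}{2} \cdot C_{\mathcal{K}}$$

[where we write $C_{\mathcal{K}} \overset{\mathrm{def}}{=} 40\eta_{\mathrm{prm}} + 2B_{\mathcal{K}}$, and we apply the estimate $6 \cdot 5 \leq 2 \cdot 4^2$]. Now let us set

$$\epsilon_E \overset{\mathrm{def}}{=} (60\delta)^2 \cdot h^{-1/2} \cdot \log(2\delta \cdot h) \quad (\geq 5 \cdot \delta \cdot h^{-1/2}); \qquad \epsilon_d^{\ast} \overset{\mathrm{def}}{=} \tfrac{1}{16} \cdot \epsilon_d \quad (< \tfrac{1}{2} \leq 1)$$

[where we apply the estimates $h \geq 1$, $\log(2\delta \cdot h) \geq \log(2\delta) \geq \log(4) \geq 1$ [cf. the property (E6) in the proof of Theorem 1.10], and $\epsilon_d \leq 1$]. Note that the inequality

$$1 < \epsilon_E = (60\delta)^2 \cdot h^{-1/2} \cdot \log(2\delta \cdot h) = (\epsilon_d^{\ast})^{-1} \cdot (60\delta)^2 \cdot h^{-1/2} \cdot \log\bigl(2^{\epsilon_d^{\ast}} \cdot \delta^{\epsilon_d^{\ast}} \cdot h^{\epsilon_d^{\ast}}\bigr) \;\leq\; (\epsilon_d^{\ast})^{-1} \cdot (60\delta)^{2+\epsilon_d^{\ast}} \cdot h^{-(1/2-\epsilon_d^{\ast})} \;\leq\; \bigl\{ (\epsilon_d^{\ast})^{-3} \cdot (60\delta)^{4+\epsilon_d} \cdot h^{-1} \bigr\}^{(1/2-\epsilon_d^{\ast})}$$

[where we apply Proposition 2.1, (i), together with the estimates

$$\frac{1}{\frac{1}{2} - \epsilon_d^{\ast}} = \frac{16}{8 - \epsilon_d} \leq 3; \qquad \frac{2 + \epsilon_d^{\ast}}{\frac{1}{2} - \epsilon_d^{\ast}} = \frac{32 + \epsilon_d}{8 - \epsilon_d} \leq 4 + \epsilon_d \leq 5$$

both of which are consequences of the fact that $0 < \epsilon_d \leq 1 \leq 3$, as well as the estimates $0 < \epsilon_d^{\ast} \leq 1$, $60\delta \geq 1$, and $h \geq 1$] implies a bound on h, hence, [by assertion (i); [GenEll], Proposition 1.4, (iv)] that there is only a finite number of possibilities for the j-invariant of $E_F$. Thus, by possibly enlarging the finite set $\mathrm{Exc}_d$ [in a fashion that depends only on $\mathcal{K}_V$, d, and $\epsilon_d$ and in such a way that $h \leq H_{\mathrm{unif}} \cdot \epsilon_d^{-3} \cdot d^{4+\epsilon_d} + H_{\mathcal{K}}$ on $\mathrm{Exc}_d$ for some positive real number $H_{\mathrm{unif}}$ that is independent of $\mathcal{K}_V$ and some positive real number $H_{\mathcal{K}}$ that depends only on $\mathcal{K}_V$], we may assume without loss of generality that $\epsilon_E \leq 1$.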
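To get a feeling for the size of the quantities involved, the following Python sketch evaluates the invariant $\epsilon_E = (60\delta)^2 \cdot h^{-1/2} \cdot \log(2\delta \cdot h)$ for $\delta = 2^{12} \cdot 3^3 \cdot 5 \cdot d$ with d = 1 and a few sample values of h. The function names `delta_from_degree` and `epsilon_E`, as well as the sample values of h, are illustrative choices and do not appear in the text; floating-point arithmetic is used, so the output is only approximate.

```python
import math


def delta_from_degree(d):
    """delta = 2^12 * 3^3 * 5 * d, as in the statement of Corollary 2.2, (ii)."""
    return 2**12 * 3**3 * 5 * d


def epsilon_E(delta, h):
    """epsilon_E = (60*delta)^2 * h^(-1/2) * log(2*delta*h), for h >= 1."""
    return (60 * delta) ** 2 * h ** -0.5 * math.log(2 * delta * h)


delta = delta_from_degree(1)
for exponent in (10, 20, 30, 40, 50):
    h = 10.0 ** exponent  # hypothetical sample heights
    print(f"h = 1e{exponent}: epsilon_E = {epsilon_E(delta, h):.3e}")
```

In this range $\epsilon_E$ decreases as h grows, and the condition $\epsilon_E \leq 1$ only holds once h is very large, which is consistent with the fact that the points where it fails are absorbed into the finite exceptional set.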
Thus, in summary, we obtain inequalities

$$\tfrac{1}{6} \cdot h \;\leq\; (1 - \tfrac{2}{5} \cdot \epsilon_E)^{-1} (1 + \tfrac{1}{5} \cdot \epsilon_E) \cdot \bigl(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\bigr) + (1 - \tfrac{2}{5} \cdot \epsilon_E)^{-1} \cdot \tfrac{1}{2} \cdot C_{\mathcal{K}} \;\leq\; (1 + \epsilon_E) \cdot \bigl(\log(\mathfrak{d}^{F_{\mathrm{tpd}}}) + \log(\mathfrak{f}^{F_{\mathrm{tpd}}})\bigr) + C_{\mathcal{K}}$$

by applying the estimates

$$\frac{1 + \frac{1}{5} \cdot \epsilon_E}{1 - \frac{2}{5} \cdot \epsilon_E} \leq 1 + \epsilon_E; \qquad 1 - \tfrac{2}{5} \cdot \epsilon_E \geq \tfrac{1}{2}$$

both of which are consequences of the fact that $0 < \epsilon_E \leq 1$. Thus, in light of (P1), together with the observation that it follows immediately from the definitions [cf. also Proposition 1.8, (vi)] that we have an equality $\text{log-diff}_X(x_E) = \log(\mathfrak{d}^{F_{\mathrm{tpd}}})$, as well as inequalities $\log(\mathfrak{f}^{F_{\mathrm{tpd}}}) \leq \text{log-cond}_D(x_E) \leq \log(\mathfrak{f}^{F_{\mathrm{tpd}}}) + \log(2l)$, we conclude that both of the conditions (C1), (C2) in the statement of assertion (ii) hold for $C_{\mathcal{K}}$ as defined above. This completes the proof of assertion (ii).

Finally, assertion (iii) follows immediately by applying the argument applied above in the proof of assertion (ii) in the case of the inequality “$1 < \epsilon_E$” to the inequality “$\epsilon < \epsilon_E$”.

Remark 2.2.1.

(i) Before proceeding, we pause to examine the asymptotic behavior of the bound obtained in Corollary 2.2, (ii), in the spirit of the discussion of Remark 1.10.5, (ii). For simplicity, we assume that $F_{\mathrm{tpd}} = \mathbb{Q}$ [so $d_{\mathrm{mod}} = 1$]; we write $h \overset{\mathrm{def}}{=} \log(\mathfrak{q}^{\forall})$ [cf. the proof of Corollary 2.2, (ii)] and $\delta \overset{\mathrm{def}}{=} \text{log-diff}_X(x_E) + \text{log-cond}_D(x_E) = \text{log-cond}_D(x_E)$ [i.e., notation that is closely related to the notation of Remark 1.10.5, (ii), but differs substantially from the notation of Corollary 2.2, (ii)]. Thus, it follows immediately from the definitions that $1 < \log(3) \leq \delta$ and $1 < \log(3) \leq h$. In particular, the bound under consideration may be written in the form

$$\tfrac{1}{6} \cdot h \;\leq\; \delta + \ast \cdot \delta^{1/2} \cdot \log(\delta)$$

where “∗” is to be understood as denoting a fixed positive real number; we observe that the ratio h/δ is always a positive real number which is bounded below by the definition of h and δ and bounded above precisely as a consequence of the bound under consideration.
In this context, it is of interest to observe that the form of the “ε term” $\delta^{1/2} \cdot \log(\delta)$ is strongly reminiscent of well-known interpretations of the Riemann hypothesis in terms of the asymptotic behavior of the function defined by considering the number of prime numbers less than a given natural number. Indeed, from the point of view of weights [cf. also the discussion of Remark 2.2.2 below], it is natural to regard the [logarithmic] height of a line bundle as an object that has the same weight as a single Tate twist, or, from a more classical point of view, “2πi” raised to the power 1. On the other hand, again from the point of view of weights, the variable “s” of the Riemann zeta function ζ(s) may be thought of as corresponding precisely to the number of Tate twists under consideration, so a single Tate twist corresponds to “s = 1”. Thus, from this point of view, “$s = \tfrac{1}{2}$”, i.e., the critical line that appears in the Riemann hypothesis, corresponds precisely to the square roots of the [logarithmic] heights under consideration, i.e., to $h^{1/2}$, $\delta^{1/2}$. Moreover, from the point of view of the computations that underlie Theorem 1.10 and Corollary 2.2, (ii) [cf., especially, the proof of Corollary 2.2, (ii); Steps (v), (viii) of the proof of Theorem 1.10; the contribution of “$b_i$” in the second displayed inequality of Proposition 1.4, (iii)], this $\delta^{1/2}$ arises as a result of a sort of “balance”, or “duality” [i.e., that occurs as one increases the size of the auxiliary prime l; cf. the discussion of Remark 1.10.5, (ii)] between the archimedean decrease in the “ε term” $\delta/l$ and the nonarchimedean increase in the “ε term” l [i.e., that arises from a certain estimate, in the proof of Proposition 1.2, (i), (ii), of the radius of convergence of the p-adic logarithm]. That is to say, such a global arithmetic duality is reminiscent of the functional equation of the Riemann zeta function [cf. the discussion of (iii) below].
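The “balance” just described can be illustrated numerically. The sketch below assumes, purely for illustration, an error term of the shape $a \cdot \delta/l + b \cdot l$, where a and b are hypothetical positive constants standing in for unspecified constants; it ignores the logarithmic factor and the requirement that l be prime. Minimizing over l gives $l \approx (a\delta/b)^{1/2}$ and a minimum value of about $2 \cdot (ab\delta)^{1/2}$, i.e., a term of order $\delta^{1/2}$.

```python
import math


def error_term(delta, l, a=1.0, b=1.0):
    """Model error term a*delta/l + b*l (a and b are hypothetical constants)."""
    return a * delta / l + b * l


def best_integer_l(delta, a=1.0, b=1.0):
    """Integer l >= 1 minimizing the model error term, found by direct search."""
    # The real-variable minimum lies at sqrt(a*delta/b); search a range that covers it.
    upper = max(2, 2 * math.isqrt(int(a * delta / b) + 1) + 2)
    return min(range(1, upper + 1), key=lambda l: error_term(delta, l, a, b))


for delta in (10**2, 10**4, 10**6):
    l = best_integer_l(delta)
    print(delta, l, error_term(delta, l), 2 * math.sqrt(delta))
```

The term $a \cdot \delta/l$ decreases and the term $b \cdot l$ increases as l grows, so the optimum sits where the two are of the same size; this is the elementary mechanism behind the square root in the model.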
(ii) In [vFr], §2, it is conjectured that, in the notation of the discussion of (i),

$$\limsup \frac{\log(\tfrac{1}{6} \cdot h - \delta)}{\log(h)} = \frac{1}{2}$$

and observed that the “$\tfrac{1}{2}$” that appears here is strongly reminiscent of the “$\tfrac{1}{2}$” that appears in the Riemann hypothesis. In the situation of Corollary 2.2, (ii), bounds are only obtained on abc sums that belong to the compactly bounded subset $\mathcal{K}_V$ under consideration; such bounds, i.e., as discussed in (i), thus imply that this lim sup is $\leq \tfrac{1}{2}$. On the other hand, it is shown in [vFr], §2 [cf. also the references quoted in [vFr]], that, if one allows arbitrary abc sums [i.e., which are not necessarily assumed to be contained in a single compactly bounded subset $\mathcal{K}_V$], then this lim sup is $\geq \tfrac{1}{2}$. It is not clear to the author at the time of writing whether or not such estimates [i.e., to the effect that the lim sup under consideration is $\geq \tfrac{1}{2}$] hold even if one imposes the restriction that the abc sums under consideration be contained in a single compactly bounded subset $\mathcal{K}_V$.

(iii) In the well-known classical theory of the Riemann zeta function, the Riemann zeta function is closely related to the theta function, i.e., by means of the Mellin transform. In light of the central role played by theta functions in the theory of the present series of papers, it is tempting to hope, especially in the context of the observations of (i), (ii), that perhaps some extension of the theory of the present series of papers [i.e., some sort of “inter-universal Mellin transform”] may be obtained that allows one to relate the theory of the present series of papers to the Riemann zeta function.

(iv) In the context of the discussion of (iii), it is of interest to recall that, relative to the analogy between number fields and one-dimensional function fields over finite fields, the theory of the present series of papers may be thought of as being analogous to the theory surrounding the derivative of a lifting of the Frobenius morphism [cf.
the discussion of [IUTchI], §I4; [IUTchIII], Remark 3.12.4]. On the other hand, the analogue of the Riemann hypothesis for one-dimensional function fields over finite fields may be proven by considering the elementary geometry of the [graph of the] Frobenius morphism. This state of affairs suggests that perhaps some sort of “integral” of the theory of the present series of papers could shed light on the Riemann hypothesis in the case of number fields.

(v) One way to summarize the point of view discussed in (i), (ii), and (iii) is as follows: The asymptotic behavior discussed in (i) suggests that perhaps one should expect that the inequality constituted by well-known interpretations of the Riemann hypothesis in terms of the asymptotic behavior of the function defined by considering the number of prime numbers less than a given natural number may be obtained as some sort of “restriction”

$$(\text{ABC inequality})\big|_{\text{canonical number}}$$

of some sort of “ABC inequality” [i.e., some sort of bound of the sort obtained in Corollary 2.2, (ii)] to some sort of “canonical number” [i.e., where the term “number” is to be understood as referring to an abc sum]. Here, the descriptive “canonical” is to be understood as expressing the idea that one is not so much interested in considering a fixed explicit “number/abc sum”, but rather some sort of suitable abstraction of the sort of sequence of numbers/abc sums that gives rise to the lim sup value of “$\tfrac{1}{2}$” discussed in (ii).
Of course, it is by no means clear precisely how such an “abstraction” should be formulated, but the idea is that it should represent some sort of average over all possible addition operations in the number field [in this case, $\mathbb{Q}$] under consideration or [perhaps equivalently] some sort of “arithmetic measure or distribution” constituted by such a collection of all possible addition operations that somehow amounts to a sort of arithmetic analogue of the measure that gives rise to the classical Mellin transform [i.e., that appears in the discussion of (iii)].

Remark 2.2.2. In the context of the discussion of weights in Remark 2.2.1, (i), it is of interest to recall the significance of the Gaussian integral

$$\int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$$

in the theory of the present series of papers [cf. [IUTchII], Introduction; [IUTchII], Remark 1.12.5, as well as Remark 1.10.1 of the present paper]. Indeed, typically discussions of the Riemann zeta function ζ(s), or more general L-functions, in the context of conventional arithmetic geometry are concerned principally with the behavior of such functions at integral values [i.e., $\in \mathbb{Z}$] of the variable s. Such integral values of the variable s correspond to integral Tate twists, i.e., at a more concrete level, to integral powers of the quantity 2πi. If one neglects nonzero factors $\in \mathbb{Q}(i)$, then such integral powers may be regarded as integral powers of π [or 2π]. At the level of classical integrals, the notion of a single Tate twist may be thought of as corresponding to the integral “$\oint = \int_{S^1}$” over the unit circle $S^1$; at the level of schemes, the notion of a single Tate twist may be thought of as corresponding to the scheme $\mathbb{G}_m$.
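The two classical integrals mentioned above can be checked numerically. The following Python sketch uses a composite trapezoid rule; the helper `trapezoid`, the truncation of the Gaussian integral to the interval [−10, 10], and the choice of the integrand dθ on the unit circle are assumptions made for this illustration only.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule for the integral of f over [a, b] with n subintervals."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * step)
    return total * step


# Gaussian integral, truncated to [-10, 10]; the neglected tails are below 1e-40.
gaussian = trapezoid(lambda x: math.exp(-x * x), -10.0, 10.0, 2000)

# Integral of d(theta) over the unit circle, parametrized by theta in [0, 2*pi].
circle = trapezoid(lambda theta: 1.0, 0.0, 2.0 * math.pi, 2000)

print(gaussian, math.sqrt(math.pi))
print(circle, 2.0 * math.pi)
print(gaussian ** 2, circle / 2.0)  # both approximate pi
```

The square of the Gaussian integral agrees numerically with π, i.e., with one half of the circle integral, which is the elementary fact behind regarding the Gaussian integral as a “square root” of a quantity of the same weight as π.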
On the other hand, whereas the conventional theory of Tate twists in arithmetic geometry only involves integral powers of a single Tate twist, i.e., corresponding, in essence, to integral powers of π, the Gaussian integral may be thought of as a sort of fundamental integral representation of the notion of a “Tate semi-twist”. From this point of view, scheme-theoretic Hodge-Arakelov theory may be thought of as a sort of fundamental scheme-theoretic representation of the notion of a “Tate semi-twist” [cf. the discussion of [IUTchII], Remark 1.12.5]. Thus, in summary,

(a) the Gaussian integral,
(b) scheme-theoretic Hodge-Arakelov theory,
(c) the inter-universal Teichmüller theory developed in the present series of papers, and
(d) the Riemann hypothesis,

may all be thought of as “phenomena of weight $\tfrac{1}{2}$”, i.e., at a concrete level, phenomena that revolve around arithmetic versions of “$\sqrt{\pi}$”. Moreover, we observe that in the first three of these four examples, the essential nature of the notion of “weight $\tfrac{1}{2}$” may be thought of as being reflected in some sort of exponential of a quadratic form. This state of affairs is strongly reminiscent of

(1) the Griffiths semi-transversality of the crystalline theta object that occurs in scheme-theoretic Hodge-Arakelov theory [cf. [HASurII], Theorem 2.8; [IUTchII], Remark 1.12.5, (i)], which corresponds essentially [cf. the discussion of the proof of [HASurII], Theorem 2.10] to the quadratic form that appears in the exponents of the well-known series expansion of the theta function;

(2) the quadratic nature of the commutator of the theta group, which is applied, in [EtTh] [cf. the discussion of [IUTchIII], Remark 2.1.1], to derive the various rigidity properties which are interpreted, in [IUTchII], §1, as multiradiality properties, an interpretation that is strongly reminiscent, if one interprets “multiradiality” in terms of “connections” and “parallel transport” [cf.
[IUTchII], Remark 1.7.1], of the quadratic form discussed in (1);

(3) the essentially quadratic nature of the “$\epsilon$-term” $\ast \cdot \frac{\delta}{l} + \ast \cdot l$ [which, we recall, occurs at the level of addition of heights, i.e., log-volumes!] in the discussion of Remark 1.10.5, (ii).

Remark 2.2.3. The discussion of Remark 2.2.1 centers around the content of Corollary 2.2, (ii), in the case of elliptic curves defined over $\mathbb{Q}$. On the other hand, if, in the context of Corollary 2.2, (ii), (iii), one considers the case where $d$ is an arbitrary positive integer [i.e., which is not necessarily bounded, as in the situation of Corollary 2.3 below!], then the inequalities obtained in (C2) of Corollary 2.2, (ii), may be regarded, by applying Corollary 2.2, (iii), as a sort of “weak version” of the so-called “uniform ABC Conjecture”. That is to say, these inequalities constitute only a “weak version” in the sense that they are restricted to rational points that lie in the compactly bounded subset $\mathcal{K}_V$, and, moreover, the bounds given for the function “$\log(\mathfrak{q}_{(-)})$” [i.e., in essence, the “height”] on $\mathrm{Exc}_d$ and $\mathrm{Exc}_{\epsilon,d}$ depend on the positive integer $d$ [cf. also Remark 2.3.2, (i), below].

Remark 2.2.4. Before proceeding, it is perhaps of interest to consider the ideas discussed in Remarks 2.2.1, 2.2.3 above in the context of the analogy between the theory of the present series of papers and the p-adic Teichmüller theory of [pOrd], [pTeich] [cf. also [InpTch]].

(i) The analogy between the theory of the present series of papers and the p-adic Teichmüller theory of [pOrd], [pTeich] [cf. also [InpTch]] is discussed in detail in [IUTchIII], Remark 1.4.1, (iii); [IUTchIII], Remark 3.12.4. In a word, this discussion concerns similarities between the log-theta-lattice considered in the present series of papers and the canonical Frobenius lifting on the ordinary locus of a canonical curve of the sort that appears in the theory of [pOrd].
Such a canonical curve is associated, in the theory of [pOrd], to a hyperbolic curve equipped with a nilpotent ordinary indigenous bundle over a perfect field of positive characteristic $p$. On the other hand, the theory of [pOrd] also addresses the universal case, i.e., of the tautological hyperbolic curve equipped with a nilpotent ordinary indigenous bundle over the moduli stack of such data in positive characteristic. In particular, one constructs, in the theory of [pOrd], a canonical Frobenius lifting over a canonical p-adic lifting of this moduli stack. This moduli stack is smooth of dimension $3g - 3 + r$ [i.e., in the case of hyperbolic curves of type $(g, r)$] over $\mathbb{F}_p$, hence, in particular, is far from perfect [i.e., as an algebraic stack in positive characteristic]. Thus, in some sense, the gap between the theory of the present series of papers, on the one hand, and the notion discussed in Remark 2.2.1, (v), of a “canonical number/arithmetic measure/distribution”, on the other, may be understood, in the context of the analogy with p-adic Teichmüller theory, as corresponding to the gap between the theory of [pOrd] specialized to the case of “canonical curves”, i.e., over perfect base fields, and the full, non-specialized version of the theory of [pOrd], i.e., which concerns canonical Frobenius liftings over the non-perfect moduli stack of hyperbolic curves equipped with a nilpotent ordinary indigenous bundle. That is to say, in a word, one has a correspondence

“canonical number” ←→ modular Frobenius liftings.

(ii) In general, the gap between perfect and non-perfect schemes in positive characteristic is reflected precisely in the extent to which the Frobenius morphism on the scheme under consideration fails to be an isomorphism. Put another way, the “phenomenon” of non-perfect schemes in positive characteristic may be thought of as a reflection of the distortion arising from the Frobenius morphism in positive characteristic.
In the context of the theory of the present series of papers [cf. [IUTchIII], Remark 1.4.1, (iii)], the Frobenius morphism in positive characteristic corresponds to the log-link. Moreover, in the context of the inequalities obtained in Theorem 1.10, the term “$\ast \cdot l$” [cf. the discussion of Remark 1.10.5, (ii)] arises, in the computations that underlie the proof of Theorem 1.10, precisely by applying the prime number theorem [i.e., Proposition 1.6] to sum up the log-volumes of the log-shells [cf. Propositions 1.2, (ii); 1.4, (iii)] at various nonarchimedean primes of the number field. In this context, we make the following observations:

· These log-volumes of log-shells may be thought of as numerical measures of the distortions of the integral structure [i.e., relative to the “arithmetic holomorphic” integral structures determined by the various local rings of integers “$\mathcal{O}$”] that arise from the log-link.

· Estimates arising from the prime number theorem are closely related to the aspects of the Riemann zeta function that are discussed in Remark 2.2.1.

· The prime number $l$ is, ultimately, in the computations of Corollary 2.2, (ii) [cf., especially, condition “(C1)”], taken to be roughly of the order of the square root of the height of the elliptic curve under consideration. That is to say, since the height of an elliptic curve “roughly controls” [i.e., up to finitely many possibilities] the moduli of the elliptic curve, the prime number $l$ may be thought of as a sort of rough numerical representation of the moduli of the elliptic curve under consideration.

Thus, in summary, these observations strongly support the point of view that the computations that underlie the proof of Theorem 1.10 may be thought of as constituting one convincing piece of evidence for the point of view discussed in (i) above.
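The observation above that $l$ is taken to be roughly of the order of the square root of the height can be illustrated by an elementary optimization. The following sketch assumes, purely for illustration, an error term of the schematic shape $a \cdot \frac{h}{l} + b \cdot l$, in which $l$ is treated as a positive real variable, $h > 0$ plays the role of the height, and $a, b > 0$ are unspecified constants. The symbols $a$, $b$, $h$, $g$ are hypothetical and are not taken from the statement of Theorem 1.10.

```latex
% Hypothetical schematic error term g(l); a, b, h > 0 are illustrative constants.
\[
  g(l) \;=\; a \cdot \frac{h}{l} \;+\; b \cdot l ,
  \qquad
  g'(l) \;=\; -\,\frac{a h}{l^{2}} + b \;=\; 0
  \;\iff\;
  l \;=\; \sqrt{\frac{a h}{b}} .
\]
% Since g''(l) = 2ah / l^3 > 0 for l > 0, this critical point is the minimum:
\[
  \min_{l > 0}\, g(l) \;=\; g\!\left(\sqrt{\tfrac{a h}{b}}\right) \;=\; 2\sqrt{a\,b\,h} .
\]
```

Under these assumptions, the optimal $l$ and the resulting minimal error term both grow like $h^{1/2}$.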
(iii) In the context of the discussion of (i), (ii), it is of interest to recall that the modular Frobenius liftings of [pOrd] are not defined over the algebraic moduli stack of hyperbolic curves over $\mathbb{Z}_p$, but rather over the p-adic formal algebraic stack [which is formally étale over the corresponding algebraic moduli stack of hyperbolic curves over $\mathbb{Z}_p$] constituted by the canonical lifting to $\mathbb{Z}_p$ of the moduli stack of hyperbolic curves equipped with a nilpotent ordinary indigenous bundle. That is to say, the gap between this [“p-adically analytic”] p-adic formal algebraic stack parametrizing “ordinary” data and the corresponding algebraic moduli stack of hyperbolic curves over $\mathbb{Z}_p$ is highly reminiscent, in the context of Corollary 2.2, (ii) [cf. also Remark 2.2.3], of the gap between the [“arithmetically analytic”] compactly bounded subsets “$\mathcal{K}_V$” [i.e., consisting of elliptic curves that satisfy the condition of being in “sufficiently general position”, a condition that may be thought of as a sort of “global arithmetic version of ordinariness”] and the entire set of algebraic points “$\mathcal{M}_{\mathrm{ell}}(\overline{\mathbb{Q}})$”. Ultimately, this gap between “$\mathcal{K}_V$” and “$\mathcal{M}_{\mathrm{ell}}(\overline{\mathbb{Q}})$” will be bridged, in Corollary 2.3 below, by applying [GenEll], Theorem 2.1, which may be thought of as a sort of arithmetic analytic continuation by means of [noncritical] Belyi maps [cf. the discussion of Belyi maps in the Introduction to [GenEll]]. This state of affairs is reminiscent of the “arithmetic analytic continuation via Belyi maps” that occurs in the theory of [AbsTopIII] [i.e., in essence, the theory of Belyi cuspidalizations] that is applied in [IUTchI], §5 [cf. [IUTchI], Remark 5.1.4].
Finally, in this context, we recall that the open immersion “$\kappa$” that appears in the discussion towards the end of [InpTch], §2.6 [i.e., which embeds a sort of perfection of the p-adic formal algebraic stack discussed above into an essentially algebraic stack given by a certain pro-finite covering of the corresponding algebraic moduli stack of hyperbolic curves over $\mathbb{Z}_p$ determined by considering representations of the geometric fundamental group into $PGL_2(\mathbb{Z}_p)$] may be thought of as a sort of p-adic analytic continuation to this corresponding algebraic moduli stack of the essentially “p-adically analytic” theory of modular Frobenius liftings developed in [pOrd].

(iv) Finally, in the context of the discussion of (i), (ii), (iii), we observe that the issue discussed in Remark 2.2.1 of considering the asymptotic behavior of the theory of the present series of papers when $l \to \infty$ may be thought of as the problem of understanding how the theory of the present series of papers behaves as one passes from the discrete approximation of the elliptic curve under consideration constituted by the $l$-torsion points of the elliptic curve to the “full continuous theory” [cf. the discussion of [IUTchI], Remark 6.12.3, (i); [HASurI], §1.3.4]. This point of view is of interest in light of the theory of Bernoulli numbers, i.e., which, on the one hand, is, as is well-known, closely related to the values [at positive even integers] of the Riemann zeta function [cf. the discussion of Remark 2.2.1], and, on the other hand, is closely related to the passage from the discrete difference operator $f(x) \mapsto f(x+1) - f(x)$ for, say, real-valued real analytic functions $f(-)$ on the real line to the continuous derivative operator $f(x) \mapsto \frac{d}{dx} f(x)$, where we recall that the operator $f(x) \mapsto f(x+1)$ may be thought of as the operator “$e^{\frac{d}{dx}}$” obtained by exponentiating this continuous derivative operator.
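The relationship recalled above between the difference operator, the derivative operator, and Bernoulli numbers may be made explicit as follows. This is a formal sketch, valid, for instance, for polynomial functions $f$ [so that all of the series involved are finite sums]. The symbols $D$ and $\Delta$, as well as the normalization $B_1 = -\frac{1}{2}$, are notational conventions introduced here for illustration.

```latex
% D = d/dx is the derivative operator; \Delta is the difference operator.
\[
  f(x+1) \;=\; \sum_{n \ge 0} \frac{1}{n!}\, f^{(n)}(x) \;=\; \bigl(e^{D} f\bigr)(x),
  \qquad
  \Delta \;\overset{\mathrm{def}}{=}\; e^{D} - 1 ,
  \quad (\Delta f)(x) = f(x+1) - f(x) .
\]
% The Bernoulli numbers B_n are defined by the generating function t/(e^t - 1);
% substituting t = D formally inverts the passage from D to \Delta:
\[
  \frac{t}{e^{t}-1} \;=\; \sum_{n \ge 0} B_{n}\,\frac{t^{n}}{n!}
  \quad\Longrightarrow\quad
  D \;=\; \Bigl(\,\sum_{n \ge 0} \frac{B_{n}}{n!}\, D^{n}\Bigr)\,\Delta .
\]
% Euler's formula for the values of the Riemann zeta function at positive even integers:
\[
  \zeta(2k) \;=\; \frac{(-1)^{k+1}\, B_{2k}\, (2\pi)^{2k}}{2\,(2k)!}
  \qquad (k \ge 1).
\]
```

For instance, taking $f(x) = x^2$ gives $(\Delta f)(x) = 2x + 1$, and the second display yields $B_0 \cdot (2x+1) + B_1 \cdot 2 = 2x = f'(x)$; taking $k = 1$ in the third display recovers $\zeta(2) = \frac{\pi^2}{6}$.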
We are now ready to state and prove the main theorem of the present §2, which may also be regarded as the main application of the theory developed in the present series of papers.

Corollary 2.3. (Diophantine Inequalities) Let $X$ be a smooth, proper, geometrically connected curve over a number field; $D \subseteq X$ a reduced divisor; $U_X \overset{\mathrm{def}}{=} X \setminus D$; $d$ a positive integer; $\epsilon \in \mathbb{R}_{>0}$ a positive real number. Write $\omega_X$ for the canonical sheaf on $X$. Suppose that $U_X$ is a hyperbolic curve, i.e., that the degree of the line bundle $\omega_X(D)$ is positive. Then, relative to the notation reviewed above, one has an inequality of “bounded discrepancy classes”
\[ \mathrm{ht}_{\omega_X(D)} \;\lesssim\; (1 + \epsilon)(\text{log-diff}_X + \text{log-cond}_D) \]
of functions on $U_X(\overline{\mathbb{Q}})^{\leq d}$, i.e., the function $(1 + \epsilon)(\text{log-diff}_X + \text{log-cond}_D) - \mathrm{ht}_{\omega_X(D)}$ is bounded below by a constant on $U_X(\overline{\mathbb{Q}})^{\leq d}$ [cf. [GenEll], Definition 1.2, (ii), as well as Remark 2.3.1, (ii), below].

Proof. One verifies immediately that the content of the statement of Corollary 2.3 coincides precisely with the content of [GenEll], Theorem 2.1, (i). Thus, it follows from the equivalence of [GenEll], Theorem 2.1, that, in order to complete the proof of Corollary 2.3, it suffices to verify that [GenEll], Theorem 2.1, (ii), holds. That is to say, we may assume without loss of generality that:

· $X = \mathbb{P}^1_{\mathbb{Q}}$ is the projective line over $\mathbb{Q}$;
· $D \subseteq X$ is the divisor consisting of the three points “0”, “1”, and “∞”;
· $\mathcal{K}_V \subseteq U_X(\overline{\mathbb{Q}})$ is a compactly bounded subset [cf. Remark 2.3.1, (vi), below] whose support contains the nonarchimedean prime “2”;
· $\mathcal{K}_V$ satisfies the condition “$(\ast_{j\text{-inv}})$” of Corollary 2.2.

[Here, we note, with regard to the condition “$(\ast_{j\text{-inv}})$” of Corollary 2.2, that this condition only concerns the behavior of $\mathcal{K}_V \cap U_X(\overline{\mathbb{Q}})^{\leq d}$ as $d$ varies; that is to say, this condition is entirely vacuous in situations, i.e., such as the situation considered in [GenEll], Theorem 2.1, (ii), in which one is only concerned with $\mathcal{K}_V \cap U_X(\overline{\mathbb{Q}})^{\leq d}$ for a fixed $d$.]
Then it suffices to show that the inequality of BD-classes of functions [cf. [GenEll], Definition 1.2, (ii), as well as Remark 2.3.1, (ii), below]
\[ \mathrm{ht}_{\omega_X(D)} \;\lesssim\; (1 + \epsilon)(\text{log-diff}_X + \text{log-cond}_D) \]
holds on $\mathcal{K}_V \cap U_X(\overline{\mathbb{Q}})^{\leq d}$. But such an inequality follows immediately, in light of the [relevant] equality of BD-classes of Corollary 2.2, (i), from Corollary 2.2, (ii) [cf. condition (C2)], (iii) [where we note that it follows immediately from the various definitions involved that $d_{\mathrm{mod}} \leq d$]. This completes the proof of Corollary 2.3.

Remark 2.3.1. We take this opportunity to correct some unfortunate misprints in [GenEll].

(i) The notation “$\mathrm{ord}_v(-) : F_v \to \mathbb{Z}$” in the final sentence of the first paragraph following [GenEll], Definition 1.1, should read “$\mathrm{ord}_v(-) : F_v^{\times} \to \mathbb{Z}$”.

(ii) In [GenEll], Definition 1.2, (ii), the non-resp’d and first resp’d items in the display should be reversed! That is to say, the notation “$\alpha \lesssim_F \beta$” corresponds to “$\alpha(x) - \beta(x) \leq C$”; the notation “$\alpha \gtrsim_F \beta$” corresponds to “$\beta(x) - \alpha(x) \leq C$”.

(iii) The first portion of the first sentence of the statement of [GenEll], Corollary 4.4, should read: “Let $\overline{\mathbb{Q}}$ be an algebraic closure of $\mathbb{Q}$; . . . ”.

(iv) The “$\text{log-diff}_{\mathcal{M}_{\mathrm{ell}}}([E_L]))$” in the second inequality of the final display of the statement of [GenEll], Corollary 4.4, should read “$\text{log-diff}_{\mathcal{M}_{\mathrm{ell}}}([E_L])$”.

(v) The equality
\[ \mathrm{ht}_E \;\approx\; (\deg(E)/\deg(\omega_X)) \cdot \mathrm{ht}_{\omega_X} \]
implicit in the final “≈” of the final display of the proof of [GenEll], Theorem 2.1, should be replaced by an inequality
\[ \mathrm{ht}_E \;\lesssim\; 2 \cdot (\deg(E)/\deg(\omega_X)) \cdot \mathrm{ht}_{\omega_X} \]
[which follows immediately from [GenEll], Proposition 1.4, (ii)], and the expression “$\deg(E)/\deg(\omega_X)$” in the inequality imposed on the choice of $\epsilon'$ should be replaced by the expression “$2 \cdot (\deg(E)/\deg(\omega_X))$”.

(vi) Suppose that we are in the situation of [GenEll], Example 1.3, (ii). Let $U \subseteq X$ be an open subscheme.
Then a “compactly bounded subset” $\mathcal{K}_V \subseteq U(\overline{\mathbb{Q}})$ $(\subseteq X(\overline{\mathbb{Q}}))$ of $U(\overline{\mathbb{Q}})$ is to be understood as a subset which forms a compactly bounded subset of $X(\overline{\mathbb{Q}})$ [i.e., in the sense discussed in [GenEll], Example 1.3, (ii)] and, moreover, satisfies the property that for each $v \in V^{\mathrm{arc}} \overset{\mathrm{def}}{=} V \cap \mathbb{V}(\mathbb{Q})^{\mathrm{arc}}$ (respectively, $v \in V^{\mathrm{non}} \overset{\mathrm{def}}{=} V \cap \mathbb{V}(\mathbb{Q})^{\mathrm{non}}$), the compact domain $\mathcal{K}_v \subseteq X^{\mathrm{arc}}$ (respectively, $\mathcal{K}_v \subseteq X(\overline{\mathbb{Q}}_v)$) is, in fact, contained in $U(\mathbb{C}) \subseteq X(\mathbb{C}) = X^{\mathrm{arc}}$ (respectively, $U(\overline{\mathbb{Q}}_v) \subseteq X(\overline{\mathbb{Q}}_v)$). In particular, this convention should be applied to the use of the term “compactly bounded subset” in the statements of [GenEll], Theorem 2.1; [GenEll], Lemma 3.7; [GenEll], Theorem 3.8; [GenEll], Corollary 4.4, as well as in the present paper [cf. the statement of Corollary 2.2; the proof of Corollary 2.3]. Although this convention was not discussed explicitly in [GenEll], Example 1.3, (ii), it is, in effect, discussed explicitly in the discussion of “compactly bounded subsets” at the beginning of the Introduction to [GenEll]. Moreover, this convention is implicit in the arguments involving compactly bounded subsets in the proof of [GenEll], Theorem 2.1.

(vii) In the discussion following the second display of [GenEll], Example 1.3, (ii), the phrase “(respectively, $X(\mathbb{Q}_v)$)” should read “(respectively, $X(\overline{\mathbb{Q}}_v)$)”.

(viii) The first display of the paragraph immediately following [GenEll], Remark 3.3.1, should read as follows:
\[ \|\alpha\|^2 \;\overset{\mathrm{def}}{=}\; \left|\, \int_{E_v} \alpha \wedge \overline{\alpha} \,\right| \]
[i.e., the integral should be replaced by the absolute value of the integral].

Remark 2.3.2.

(i) The reader will note that, by arguing with a “bit more care”, it is not difficult to give stronger versions of the various estimates that occur in Theorem 1.10; Corollaries 2.2, 2.3 and their proofs. Such stronger estimates are, however, beyond the scope of the present series of papers, so we shall not pursue this topic further in the present paper.
(ii) On the other hand, we observe that the constant “1” in the inequality of the display of Corollary 2.3 cannot be improved; cf. the examples constructed in [Mss]; the discussion of Remark 1.10.5, (ii), (iii). This observation is closely related to discussions of how the theory of the present series of papers breaks down if one attempts to replace the first power of the étale theta function by its $N$-th power for some integer $N \geq 2$ [cf. the discussion in the final portion of Step (xi) of the proof of [IUTchIII], Corollary 3.12; the discussion of [IUTchIII], Remark 3.12.1, (ii)]. Such an “$N$-th power operation” may also be thought of as corresponding to the operation of replacing each Tate curve that occurs at an element $\in \mathbb{V}^{\mathrm{bad}}$ by the Tate curve whose $q$-parameter is given by the $N$-th power of the $q$-parameter of the original Tate curve. This sort of operation on Tate curves may, in turn, be thought of as an isogeny of the sort that occurs in [GenEll], Lemma 3.5. On the other hand, the content of the proof of [GenEll], Lemma 3.5, consists essentially of a computation to the effect that even if one attempts to consider such “$N$-th power isogenies” at certain elements $\in \mathbb{V}^{\mathrm{bad}}$, the global height of the elliptic curve over a number field that arises from such an isogeny will typically remain, up to a relatively small discrepancy, unchanged. In this context, we recall that this sort of invariance, up to a relatively small discrepancy, of the global height under isogeny is one of the essential observations that underlies the theory of [Falt], a state of affairs that is also of interest in light of the observations of Remark 2.3.3 below.

Remark 2.3.3. Corollary 2.3 may be thought of as an effective version of the Mordell Conjecture. From this point of view, it is perhaps of interest to compare the “essential ingredients” that are applied in the proof of Corollary 2.3 [i.e., in effect, that are applied in the present series of papers!]
with the “essential ingredients” applied in [Falt]. The following discussion benefited substantially from numerous e-mail and Skype exchanges with Ivan Fesenko during the summer of 2015.

(i) Although the author does not wish to make any pretensions to completeness in any rigorous sense, perhaps a rough, informal list of “essential ingredients” in the case of [Falt] may be given as follows:

(a) results in elementary algebraic number theory related to the “geometry of numbers”, such as the theory of heights and the Hermite-Minkowski theorem;
(b) the global class field theory of number fields;
(c) the p-adic theory of Hodge-Tate decompositions;
(d) the p-adic theory of finite flat group schemes;
(e) generalities in algebraic geometry concerning isogenies and Tate modules of abelian varieties;
(f) generalities in algebraic geometry concerning polarizations of abelian varieties;
(g) the logarithmic geometry of toroidal compactifications of the moduli stack of abelian varieties.

With regard to the global class field theory of (b), we observe that there are numerous different approaches to “dissecting” the proofs of the main results of global class field theory into more primitive components. To some extent, these different approaches correspond to different points of view arising from subsequent research on topics related to global class field theory. Here, we wish to consider the approach taken in [Lang1], Chapters VIII, IX, X, XI, which is attributed [cf. the Introduction to [Lang1], Part Two] to Weber. It is of interest, in the context of the discussion of (vii) below, that this is apparently the oldest approach to proving certain portions of global class field theory. It is also of interest that this approach motivates the approach to global class field theory via consideration of density of primes in arithmetic progressions and splitting laws. This aspect of this approach of [Lang1] is closely related to various issues that appear in [Falt] [cf.
[Lang1], Chapter VIII, §5]. Moreover, as we shall see in the following discussion, this approach of [Lang1] to global class field theory is well-suited to discussions of comparisons between the theory of [Falt] and the inter-universal Teichmüller theory developed in the present series of papers. At a technical level, the dissection of the global class field theory of (b), as developed in [Lang1], into more primitive components may be summarized as follows:

(b-1) the local class field theory of p-adic local fields [cf. [Lang1], Chapter IX, §3; [Lang1], Chapter XI, §4];
(b-2) the theory of global density of primes [cf. the discussion surrounding the Universal Norm Index Inequality in [Lang1], Chapter VIII, §3];
(b-3) results in elementary algebraic number theory related to the “geometry of numbers” that give rise to the Unit Theorem [cf. [Lang1], Chapter V, §1; [Lang1], Chapter IX, §4];
(b-4) the global reciprocity law, i.e., in effect, the existence of a conductor for the Artin symbol [cf. [Lang1], Chapter X, §2];
(b-5) Kummer theory [cf. [Lang1], Chapter XI, §1].

Here, we recall that (b-1), (b-2), and (b-3) are applied in [Lang1], Chapter IX, §5, to verify the Universal Norm Index Equality for cyclic extensions. This Universal Norm Index Equality is then applied in [Lang1], Chapter X, §1, and combined with the theory of cyclotomic extensions in [Lang1], Chapter X, §2, to verify (b-4). Finally, (b-4) is combined with (b-5) in [Lang1], Chapter XI, §2, to complete the proof of the Existence Theorem for class fields.

(ii) From the point of view of the theory of the present series of papers, (a), together with (b-3), is reminiscent of the elementary algebraic number theory characterization of nonzero global integers as roots of unity, which plays an important role in the theory of the present series of papers [cf. [IUTchIII], the proof of Proposition 3.10].
Moreover, (a) is also reminiscent of the arithmetic degrees of line bundles that appear, for instance, in the form of global realified Frobenioids, throughout the theory of the present series of papers. Next, we observe that (b-1) is reminiscent of the p-adic absolute anabelian geometry of [AbsTopIII] [cf., e.g., [AbsTopIII], Corollary 1.10, (i)]. On the other hand, (b-2) is reminiscent of repeated applications of the Prime Number Theorem in the present paper [cf. Propositions 1.6; 2.1, (ii)]; this comparison between (b-2) and the Prime Number Theorem will be discussed in more detail in (iv) below. Next, we observe [cf. the discussion of the latter portion of [IUTchIII], Remark 3.12.1, (iii)] that (b-4) is reminiscent of the application of the elementary fact “$\mathbb{Q}_{>0} \cap \widehat{\mathbb{Z}}^{\times} = \{1\}$” in the multiradial algorithms for cyclotomic rigidity isomorphisms in the number field case [cf. [IUTchI], Example 5.1, (v), as well as the discussion of [IUTchIII], Remarks 2.3.2, 2.3.3], that is to say, not only in the sense that both are closely related to the various cyclotomes that appear in global class field theory or inter-universal Teichmüller theory, but also in the sense that both may be regarded as analogues of the usual product formula [i.e., which appears at the level of Frobenius-like monoids isomorphic to the multiplicative group of nonzero elements of a number field] at the level of certain [étale-like!] profinite Galois groups related to global number fields. On the other hand, (b-5) is reminiscent of the central role played throughout inter-universal Teichmüller theory by constructions modeled on classical Kummer theory. In fact, these comparisons involving (b-4) and (b-5) are closely related to one another and will be discussed in more detail in (v), (vi), and (vii) below.
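The elementary fact “$\mathbb{Q}_{>0} \cap \widehat{\mathbb{Z}}^{\times} = \{1\}$” invoked above admits a short verification, sketched here for convenience. The symbols $x$ and $v_p$ are notation introduced for this sketch, and the intersection is assumed to be taken inside the finite adeles $\widehat{\mathbb{Z}} \otimes \mathbb{Q}$ via the diagonal embedding of $\mathbb{Q}$ [an assumption concerning the ambient object, which is left implicit in the discussion above].

```latex
% v_p denotes the p-adic valuation on Q (notation introduced for this sketch only).
\[
  x \in \mathbb{Q}_{>0}
  \;\Longrightarrow\;
  x \;=\; \prod_{p} p^{\,v_{p}(x)} ,
  \qquad
  x \in \widehat{\mathbb{Z}}^{\times}
  \;\iff\;
  v_{p}(x) = 0 \ \text{ for every prime } p .
\]
% Combining the two conditions: every exponent vanishes, hence
\[
  x \in \mathbb{Q}_{>0} \cap \widehat{\mathbb{Z}}^{\times}
  \;\Longrightarrow\;
  x \;=\; \prod_{p} p^{\,0} \;=\; 1 .
\]
```

The positivity condition is what excludes the remaining unit $-1 \in \mathbb{Q} \cap \widehat{\mathbb{Z}}^{\times}$.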
Next, we recall that Hodge-Tate decompositions as in (c) play a central role in the proofs of the main results of [pGC], which, in turn, underlie the theory of [AbsTopIII]. The ramification computations concerning finite flat group schemes as in (d) are reminiscent of various p-adic ramification computations concerning log-shells in [AbsTopIII], as well as in Propositions 1.1, 1.2, 1.3, 1.4 of the present paper. Whereas [Falt] revolves around the abelian/linear theory of abelian varieties [cf. (e)], the theory of the present series of papers depends, in an essential way, on various intricate manipulations involving finite étale coverings of hyperbolic curves, such as the use of Belyi maps in [GenEll], as well as in the Belyi cuspidalizations applied in [AbsTopIII]. The theory of polarizations of abelian varieties applied in [Falt] [cf. (f)] is reminiscent of the essential role played by commutators of theta groups in the theory of [EtTh], which, in turn, plays a central role in the theory of the present series of papers. Finally, the logarithmic geometry of (g) is reminiscent of the combinatorial anabelian geometry of [SemiAnbd], which is applied, in [IUTchI], §2, to the logarithmic geometry of coverings of stable curves.

(iii) One way to summarize the discussion of (ii) is as follows:

many aspects of the theory of [Falt] may be regarded as “distant abelian ancestors” of certain aspects of the “anabelian-based theory” of inter-universal Teichmüller theory.
Alternatively, one may observe that the overwhelmingly scheme-theoretic nature of the theory applied in [Falt] lies in stark contrast to the highly non-scheme-theoretic nature of the absolute anabelian geometry and theory of monoids/Frobenioids applied in the present series of papers: that is to say,

many aspects of the theory of [Falt] may be regarded as “distant arithmetically holomorphic ancestors” of certain aspects of the multiradial and mono-analytic [i.e., “arithmetically real analytic”] theory developed in inter-universal Teichmüller theory.

One way to understand this fundamental difference between the theory of [Falt] and inter-universal Teichmüller theory is by considering the naive goal of constructing some sort of “Frobenius morphism” on a number field [cf. the discussion of [FrdI], §I3], i.e., which has the effect of multiplying arithmetic degrees by a positive factor > 1: whereas the theory of [Falt] [cf., e.g., the argument of the proof of [GenEll], Lemma 3.5, as discussed in Remark 2.3.2, (ii)] may be regarded as a reflection of the point of view that, so long as one respects the arithmetic holomorphic structure of scheme theory, such a “Frobenius morphism” on a number field cannot exist, the essential content of inter-universal Teichmüller theory may be summarized in a word as the assertion that,

if one dismantles this arithmetic holomorphic structure in a suitably canonical fashion and allows oneself to work with multiradial/mono-analytic [i.e., “arithmetically quasi-conformal”] structures, then one can indeed construct, in a very canonical fashion, such a “Frobenius morphism” on a number field.
(iv) In the context of the comparison discussed in (ii) concerning (b-2), it is of interest to note that the fundamental difference discussed in (iii) between the theory of [Falt] and inter-universal Teichmüller theory is, in some sense, reflected in the difference between the theory of global density of primes [i.e., (b-2)] and the Prime Number Theorem. That is to say, the coherence of the sorts of collections of primes that appear in the theory of global density of primes may be thought of as a sort of representation, in the context of analytic number theory, of the arithmetic holomorphic structure of conventional scheme theory. By contrast, in the context of the Prime Number Theorem, primes of a number field appear, so to speak, one by one, i.e., in a fashion that is only possible if one deactivates, in the context of analytic number theory, the coherence that underlies the aggregations of primes that appear in the theory of global density of primes. That is to say, this approach to treating primes “one by one” may be thought of as corresponding to the dismantling of arithmetic holomorphic structures that occurs in the context of the multiradial/mono-analytic structures that appear in inter-universal Teichmüller theory. Here, it is also of interest to note that the way in which one “deactivates aggregations of primes” in the context of the Prime Number Theorem may be thought of [cf. the discussion of [IUTchIII], Remark 3.12.2, (i), (c)] as a sort of dismantling of the ring structure of a number field into its underlying additive [i.e., counting primes “one by one”!] and multiplicative structures [i.e., the very notion of a prime!].

(v) The fundamental difference discussed in (iii) between the theory of [Falt] and inter-universal Teichmüller theory may also be seen in the context of the comparison discussed in (ii) concerning (b-4).
Indeed, the global reciprocity law of (b-4), which plays a central role in global class field theory, depends, in an essential way, on nontrivial relationships between local units [such as the unit determined by a prime number $l$ at a nonarchimedean prime of a number field of residue characteristic $\neq l$] at one prime of a number field and elements of local value groups [such as the element determined by $l$ at a nonarchimedean prime of a number field of residue characteristic $l$] at another prime of the number field. Such nontrivial relationships are fundamentally incompatible with the splittings/decouplings of local units and local value groups that play a central role in the dismantling of arithmetic holomorphic structures that occurs in inter-universal Teichmüller theory [cf. the discussion of [IUTchIII], Remark 2.3.3, (i); [IUTchIII], Remark 3.12.2, (i), (a)]. This incompatibility [i.e., with nontrivial relationships between local units and local value groups at nonarchimedean primes with distinct residue characteristics] may also be seen quite explicitly in the structure of the various types of prime-strips that appear in inter-universal Teichmüller theory [cf. [IUTchI], Fig. I1.2]. That is to say, such nontrivial relationships, which form the content of the global reciprocity law of global class field theory, may be thought of as a sort of global Galois-theoretic representation of the constraints that constitute the arithmetic holomorphic structure of conventional scheme theory.

(vi) Another fundamental aspect of the comparison discussed in (ii) concerning (b-4) may be seen in the fact that whereas the global reciprocity law of global class field theory concerns the global reciprocity map, the cyclotomic rigidity algorithms of inter-universal Teichmüller theory to which (b-4) was compared appear in the context of Kummer-theoretic isomorphisms.
That is to say, although both the global reciprocity map and Kummer-theoretic isomorphisms involve correspondences between multiplicative monoids associated to number fields and multiplicative monoids that arise from global Galois groups, one fundamental difference between these two types of correspondence lies in the fact that whereas Kummer-theoretic isomorphisms satisfy very strong covariant [with respect to functions] functoriality properties, the reciprocity maps that appear in various versions of class field theory tend not to satisfy such strong functoriality properties. This presence or absence of strong functoriality properties is, to a substantial extent, a reflection of the fact that whereas Kummer theory may be performed in a very straightforward, tautological, “general nonsense” fashion in a wide variety of situations, class field theory may only be conducted in very special arithmetic situations. This presence of strong functoriality properties [i.e., in the case of Kummer theory] is the essential reason for the central role played by Kummer theory [cf. (b-5)] in inter-universal Teichmüller theory, as well as in many situations that arise in anabelian geometry in general [cf., e.g., the theory of [Cusp]]. Indeed, the very tautological/ubiquitous/strongly functorial nature of Kummer theory makes it well-suited to the sort of dismantling of ring structures that occurs in inter-universal Teichmüller theory, as well as to the various evaluation operations of functions at special points that play a central role, in the context of Galois evaluation, in inter-universal Teichmüller theory [cf. the discussion of [IUTchIII], Remark 2.3.3]. By contrast, although there exist various higher-dimensional versions of class field theory involving higher algebraic K-groups, these versions of class field theory are fundamentally incompatible with the crucial evaluation of function operations of the sort that occur in inter-universal Teichmüller theory.
Indeed, more generally, except for very exceptional classical cases involving exponential functions in the case of $\mathbb{Q}$ or modular and elliptic functions in the case of imaginary quadratic fields, class field theory tends to be very ill-suited to situations that involve the evaluation of special functions at special points. Moreover, even if one restricts one’s attention, for instance, to functoriality with respect to passing to a finite extension field, the functoriality of the reciprocity maps that occur in class field theory is [unlike Kummer-theoretic isomorphisms!] contravariant [with respect to functions] and can only be made covariant if one applies some sort of nontrivial duality result to reverse the direction of the maps, a state of affairs that makes class field theory very difficult to apply not only in inter-universal Teichmüller theory, but also in many situations that arise in anabelian geometry. On the other hand, in the context of inter-universal Teichmüller theory, the price, so to speak, that one pays for the very convenient, “general nonsense” nature of Kummer theory lies in the highly nontrivial nature, which may be seen, for instance, in the establishment of various multiradiality properties, of the cyclotomic rigidity algorithms that appear in inter-universal Teichmüller theory [cf. the discussion of [IUTchIII], Remark 2.3.3]. Here, we recall that such cyclotomic rigidity algorithms, which never appear in discussions of conventional arithmetic geometry in which the arithmetic holomorphic structure is held fixed, play a central role in inter-universal Teichmüller theory precisely because of the indeterminacies that arise as a consequence of the dismantling of the arithmetic holomorphic structure.
Finally, in this context, it is of interest to recall that, although local class field theory is, in a certain limited sense, applied in inter-universal Teichmüller theory, i.e., in order to obtain cyclotomic rigidity algorithms for MLF-Galois pairs [cf. [IUTchII], Proposition 1.3, (ii)], it is only “of limited use” in the sense that the resulting cyclotomic rigidity algorithms are uniradial [i.e., fail to be multiradial; cf. [IUTchIII], Figs. 2.1, 3.7, and the surrounding discussions].

(vii) The fundamental incompatibility discussed in (vi) of class field theory with situations that involve the evaluation of special functions at special points [i.e., except in very exceptional classical cases involving exponential functions in the case of $\mathbb{Q}$ or modular and elliptic functions in the case of imaginary quadratic fields] is highly reminiscent of the original point of view of class field theory in the early twentieth century [cf. Kronecker’s Jugendtraum, Hilbert’s twelfth problem], i.e., to the effect that further development of class field theory should proceed precisely by extending the theory involving evaluation of special functions at special points that exists in these “exceptional classical cases” to the case of arbitrary number fields. This state of affairs is, in turn, highly reminiscent of the fact that the approach taken in the above discussion to “dissecting global class field theory” is the oldest/original approach to global class field theory, as well as of the fact that this original approach is the most well-suited to discussions of comparisons between the theory of [Falt] and inter-universal Teichmüller theory. This state of affairs is also highly reminiscent of the discussion in [Pano], §3, §4, of the numerous analogies between inter-universal Teichmüller theory and the classical [i.e., dating back to the nineteenth century!] theory surrounding Jacobi’s identity for the theta function on the upper half-plane and Gaussian distributions/integrals.
Finally, this collection of observations, taken as a whole, may be summarized as follows:

Many of the ideas that appear in inter-universal Teichmüller theory bear a much closer resemblance to the mathematics of the late nineteenth and early twentieth centuries, i.e., to the mathematics of Gauss, Jacobi, Kummer, Kronecker, Weber, Frobenius, Hilbert, and Teichmüller, than to the mathematics of the mid- to late twentieth century. This close resemblance suggests strongly that, relative to the mathematics of the late nineteenth and early twentieth centuries, the course of development of a substantial portion of the mathematics of the mid- to late twentieth century should not be regarded as “unique” or “inevitable”, but rather as being merely one possible choice among many viable and fruitful alternatives that existed a priori.

Here, we note that although the use, in inter-universal Teichmüller theory, of Belyi maps, as well as of the p-adic anabelian geometry of the 1990’s [i.e., [pGC]], may at first glance look like an instance of “exceptions” to the “rule” constituted by this point of view, these “exceptions” may be thought of as “proving the rule” in the sense that they are far from typical of the mathematics of the late twentieth century.

Remark 2.3.4. Various aspects of the theory of the present series of papers are substantially reminiscent of the theory surrounding Bogomolov’s proof of the geometric version of the Szpiro Conjecture, as discussed in [ABKP], [Zh]. Put another way, these aspects of the theory of the present series of papers may be thought of as arithmetic analogues of the geometric theory surrounding Bogomolov’s proof. Alternatively, Bogomolov’s proof may be thought of as a sort of useful elementary guide, or blueprint [perhaps even a sort of Rosetta stone!], for understanding substantial portions of the theory of the present series of papers.
The author would like to express his gratitude to Ivan Fesenko for bringing to his attention, via numerous discussions in person, e-mails, and Skype conversations between December 2014 and January 2015, the possibility of the existence of such fascinating connections between Bogomolov’s proof and the theory of the present series of papers. We discuss these analogies in more detail in [BogIUT].

Remark 2.3.5. In [Par], a proof is given of the Mordell Conjecture for function fields over the complex numbers. Like the proof of Bogomolov discussed in Remark 2.3.4, Parshin’s proof involves metric estimates of “displacements” that arise from actions of elements of the [usual topological] fundamental group of the complex hyperbolic curve that serves as the base scheme of the given family of curves. In particular, we observe that one may pose the following question:

Is it possible to apply some portion of the ideas of the inter-universal Teichmüller theory developed in the present series of papers to obtain a proof of the Mordell Conjecture over number fields without making use of Belyi maps as in the proof of Corollary 2.3 [i.e., the proof of [GenEll], Theorem 2.1]?

This question was posed to the author by Felipe Voloch in an e-mail in September 2015. The answer to this question is, as far as the author can see at the time of writing, “no”. On the other hand, this question is interesting in the context of the discussion of Remarks 2.3.3 and 2.3.4 in that it serves to highlight various interesting aspects of inter-universal Teichmüller theory, as we explain in the following discussion.
(i) First, we recall [cf., e.g., [Lang2], Chapter I, §1, §2, for more details] that the starting point of the theory of the Kobayashi distance on a [Kobayashi] hyperbolic complex manifold is the well-known Schwarz lemma of elementary complex analysis and its consequences for the geometry of holomorphic maps from the open unit disc D in the complex plane to an arbitrary complex manifold. In the following discussion, we shall refer to this geometry as the Schwarz-theoretic geometry of D. Perhaps the most fundamental difference between the proofs of Parshin and Bogomolov lies in the fact that

(PB1) Whereas Parshin’s proof revolves around estimates of displacements arising from actions of elements of the fundamental group on a certain two-dimensional complete [Kobayashi] hyperbolic complex manifold by means of the holomorphic geometry of the Kobayashi distance, i.e., in effect, the Schwarz-theoretic geometry of D, Bogomolov’s proof [cf. the review of Bogomolov’s proof given in [BogIUT]] revolves around estimates of displacements arising from actions of elements of the fundamental group on a one-dimensional real analytic manifold [i.e., a universal covering of a copy of the unit circle $S^1$] by means of the real analytic symplectic geometry of the upper half-plane.

Here, it is already interesting to note that this fundamental gap, in the case of results over complex function fields, between the holomorphic geometry applied in Parshin’s proof of the Mordell Conjecture and the real analytic symplectic geometry applied in Bogomolov’s proof of the Szpiro Conjecture is highly reminiscent of the fundamental gap discussed in Remark 2.3.3, (iii), in the case of results over number fields, between the arithmetically holomorphic nature of the proof of the Mordell Conjecture given in [Falt] and the “arithmetically quasi-conformal” nature of the proof of the Szpiro Conjecture [cf. Corollary 2.3] via inter-universal Teichmüller theory given in the present series of papers. That is to say,

Parshin’s proof is best understood not as a “weaker, or simplified, version of Bogomolov’s proof obtained by extracting certain portions of Bogomolov’s proof”, but rather as a proof that reflects a fundamentally qualitatively different geometry [i.e., holomorphic, as opposed to real analytic] from Bogomolov’s proof.

This point of view already suggests rather strongly, relative to the analogy between Bogomolov’s proof and inter-universal Teichmüller theory [cf. [BogIUT]], that it is unnatural/unrealistic to expect to obtain a new proof of the Mordell Conjecture over number fields by applying some portion of the ideas of inter-universal Teichmüller theory.

(ii) At a more technical level, the fundamental difference (PB1) discussed in (i) may be seen in the fact that

(PB2) whereas Parshin’s proof involves numerous holomorphic maps from the open unit disc D into one- and two-dimensional complex manifolds [i.e., in essence, the universal coverings of the base space and total space of the family of curves under consideration], Bogomolov’s proof revolves around the real analytic symplectic geometry of a fixed copy of the open unit disc D [or, equivalently, the upper half-plane], i.e., in Bogomolov’s proof, one never considers holomorphic maps from D to itself which are not biholomorphic.

The essentially arbitrary nature of these numerous holomorphic maps that appear in Parshin’s proof is reflected in the fact that

(PB3) Parshin’s proof is well-suited to proving a rough qualitative [i.e., “finiteness”] result for families of curves of arbitrary genus $\geq 2$, whereas Bogomolov’s proof is well-suited to proving a much finer explicit inequality, but only in the case of families of elliptic curves.
Another technical aspect of the proofs of Parshin and Bogomolov that is closely related to both (PB2) and (PB3) is the fact that

(PB4) whereas the estimation apparatus of Bogomolov’s proof depends in an essential way on special properties of particular types of elements [such as unipotent elements or commutators] of the fundamental group under consideration, the estimation apparatus of Parshin’s proof is uniform for arbitrary [“sufficiently small”] elements of the fundamental group under consideration.

(iii) Although, as discussed in (ii), it is difficult to see how Parshin’s proof could be “embedded” into [i.e., obtained as a “suitable portion of”] Bogomolov’s proof, the Schwarz-theoretic geometry of D admits a “natural embedding” into [i.e., admits a natural analogy to a suitable portion of] inter-universal Teichmüller theory, namely, in the form of the theory of categories of localizations of the sort that appear in [GeoAnbd], §2; [AbsTopI], §4; [AbsTopII], §3. This theory of categories of localizations culminates in the theory of Belyi cuspidalizations, which is discussed in [AbsTopII], §3, and applied to obtain the mono-anabelian reconstruction algorithms of [AbsTopIII], §1. Moreover, the analogy between such categories of localizations and the classical Schwarz-theoretic geometry of D [or, equivalently, the upper half-plane] is discussed in the Introduction to [GeoAnbd], as well as in [IUTchI], Remark 5.1.4. This theory of categories of localizations may be summarized roughly as follows:

In the context of absolute anabelian geometry over number fields and their nonarchimedean localizations, Belyi maps play the role of the Schwarz-theoretic geometry of the open unit disc D, i.e., the role of realizing a sort of arithmetic version of analytic continuation.
This point of view is also interesting in the context of the discussion of Remark 2.2.4, (iii), i.e., to the effect that [noncritical] Belyi maps play the role of realizing a sort of arithmetic version of analytic continuation in the proof of [GenEll], Theorem 2.1. That is to say, from the point of view of the question posed at the beginning of the present Remark 2.3.5:

Even if, in the context of inter-universal Teichmüller theory, one attempts to search for an analogue of Parshin’s proof in the form of a “suitable portion” of the inter-universal Teichmüller theory developed in [IUTchI], [IUTchII], [IUTchIII] [i.e., even if one avoids consideration of the application of [noncritical] Belyi maps in the proof of Corollary 2.3 via [GenEll], Theorem 2.1], one is ultimately led [i.e., from the point of view of considering arithmetic analogues of the classical complex theory of analytic continuation and the Schwarz-theoretic geometry of the open unit disc D] to the Belyi maps that appear in the Belyi cuspidalizations of [AbsTopII], §3; [AbsTopIII], §1.

Put another way, it appears that any search in the realm of inter-universal Teichmüller theory either for some proof of the Mordell Conjecture [over number fields] or for some analogue of Parshin’s proof [of the Mordell Conjecture over complex function fields] leads inevitably to some application of Belyi maps to realize some sort of arithmetic analogue of the classical complex theory of analytic continuation and the Schwarz-theoretic geometry of the open unit disc D.

Section 3: Inter-universal Formalism: the Language of Species

In the present §3, we develop [albeit from an extremely naive/non-expert point of view, relative to the theory of foundations!] the language of species. Roughly speaking, a “species” is a “type of mathematical object”, such as a “group”, a “ring”, a “scheme”, etc.
In some sense, this language may be thought of as an explicit description of certain tasks typically executed at an implicit, intuitive level by mathematicians [i.e., mathematicians who are not equipped with a detailed knowledge of the theory of foundations!] via a sort of “mental arithmetic” in the course of interpreting various mathematical arguments. In the context of the theory developed in the present series of papers, however, it is useful to describe these intuitive operations explicitly.

In the following discussion, we shall work with various models, consisting of “sets” and a relation “$\in$”, of the standard ZFC axioms of axiomatic set theory [i.e., the nine axioms of Zermelo-Fraenkel, together with the axiom of choice; cf., e.g., [Drk], Chapter 1, §3]. We shall refer to such models as ZFC-models. Recall that a (Grothendieck) universe $V$ is a set satisfying the following axioms [cf. [McLn], p. 194]:

(i) $V$ is transitive, i.e., if $y \in x$, $x \in V$, then $y \in V$.
(ii) The set of natural numbers $\mathbb{N} \in V$.
(iii) If $x \in V$, then the power set of $x$ also belongs to $V$.
(iv) If $x \in V$, then the union of all members of $x$ also belongs to $V$.
(v) If $x \in V$, $y \subseteq V$, and $f : x \to y$ is a surjection, then $y \in V$.

We shall say that a set $E$ is a $V$-set if $E \in V$.

The various ZFC-models that we work with may be thought of as [but are not restricted to be!] the ZFC-models determined by various universes that are sets relative to some ambient ZFC-model which, in addition to the standard axioms of ZFC set theory, satisfies the following existence axiom [attributed to the “Grothendieck school”; cf. the discussion of [McLn], p. 193]:

($\dagger_G$) Given any set $x$, there exists a universe $V$ such that $x \in V$.

We shall refer to a ZFC-model that also satisfies this additional axiom of the Grothendieck school as a ZFCG-model.
This existence axiom ($\dagger_G$) implies, in particular, that:

Given a set $I$ and a collection of universes $V_i$, where $i \in I$, indexed by $I$ [i.e., a ‘function’ $I \ni i \mapsto V_i$], there exists a [larger] universe $V$ such that $V_i \in V$, for $i \in I$.

Indeed, since the graph of the function $I \ni i \mapsto V_i$ is a set, it follows that $\{V_i\}_{i \in I}$ is a set. Thus, it follows from the existence axiom ($\dagger_G$) that there exists a universe $V$ such that $\{V_i\}_{i \in I} \in V$. Hence, by condition (i), we conclude that $V_i \in V$, for all $i \in I$, as desired. Note that this means, in particular, that there exist infinite ascending chains of universes

$V_0 \in V_1 \in V_2 \in V_3 \in \ldots \in V_n \in \ldots \in V$

where $n$ ranges over the natural numbers. On the other hand, by the axiom of foundation, there do not exist infinite descending chains of universes

$V_0 \ni V_1 \ni V_2 \ni V_3 \ni \ldots \ni V_n \ni \ldots$

where $n$ ranges over the natural numbers.

Although we shall not discuss in detail here the quite difficult issue of whether or not there actually exist ZFCG-models, we remark in passing that it may be possible to justify the stance of ignoring such issues in the context of the present series of papers [at least from the point of view of establishing the validity of various “final results” that may be formulated in ZFC-models] by invoking the work of Feferman [cf. [Ffmn]]. Precise statements concerning such issues, however, lie beyond the scope of the present paper [as well as of the level of expertise of the author!].
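The containment argument and the two chain statements above can be set out symbolically as follows. This is only a sketch: the recursive choice of $V_{n+1}$ is one assumed way of producing an ascending chain, and the set $A$ is notation introduced here purely for illustration.

```latex
% (1) A family of universes indexed by a set I lies in a single universe V.
\[
  \{V_i\}_{i \in I} \in V \ \ [\text{by } (\dagger_G)]
  \quad\text{and}\quad
  V_i \in \{V_i\}_{i \in I}
  \ \Longrightarrow\
  V_i \in V \ \ [\text{by transitivity, i.e., condition (i)}].
\]
% (2) Ascending chains (assumed construction): apply (\dagger_G) repeatedly,
%     then apply (1) to the resulting family indexed by I = \mathbb{N}.
\[
  \emptyset \in V_0, \qquad V_n \in V_{n+1} \quad (n \in \mathbb{N}),
  \qquad V_n \in V \quad (n \in \mathbb{N}).
\]
% (3) No descending chains: if V_{n+1} \in V_n for all n, then the nonempty set A
%     has no element disjoint from A, which contradicts the axiom of foundation.
\[
  A \overset{\mathrm{def}}{=} \{V_n\}_{n \in \mathbb{N}}, \qquad
  V_{n+1} \in V_n \cap A \quad (n \in \mathbb{N})
  \ \Longrightarrow\
  V_n \cap A \neq \emptyset \quad (n \in \mathbb{N}).
\]
```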
In the following discussion, we use the phrase “set-theoretic formula” as it is conventionally used in discussions of axiomatic set theory [cf., e.g., [Drk], Chapter 1, §2], with the following proviso: In the following discussion, it should be understood that every set-theoretic formula that appears is “absolute” in the sense that its validity for a collection of sets contained in some universe $V$ relative to the model of set theory determined by $V$ is equivalent, for any universe $W$ such that $V \in W$, to its validity for the same collection of sets relative to the model of set theory determined by $W$ [cf., e.g., [Drk], Chapter 3, Definition 4.2].

Definition 3.1.

(i) A 0-species $\mathfrak{S}_0$ is a collection of conditions given by a set-theoretic formula $\Phi_0(\mathfrak{E})$ involving an ordered collection $\mathfrak{E} = (\mathfrak{E}_1, \ldots, \mathfrak{E}_{n_0})$ of sets $\mathfrak{E}_1, \ldots, \mathfrak{E}_{n_0}$ [which we think of as “indeterminates”], for some integer $n_0 \geq 1$; in this situation, we shall refer to $\mathfrak{E}$ as a collection of species-data for $\mathfrak{S}_0$. If $\mathfrak{S}_0$ is a 0-species given by a set-theoretic formula $\Phi_0(\mathfrak{E})$, then a 0-specimen of $\mathfrak{S}_0$ is a specific ordered collection of $n_0$ sets $E = (E_1, \ldots, E_{n_0})$ in some specific ZFC-model that satisfies $\Phi_0(E)$. If $E$ is a 0-specimen of a 0-species $\mathfrak{S}_0$, then we shall write $E \in \mathfrak{S}_0$. If, moreover, it holds, in any ZFC-model, that the 0-specimens of $\mathfrak{S}_0$ form a set, then we shall refer to $\mathfrak{S}_0$ as 0-small.

(ii) Let $\mathfrak{S}_0$ be a 0-species. Then a 1-species $\mathfrak{S}_1$ acting on $\mathfrak{S}_0$ is a collection of set-theoretic formulas $\Phi_1$, $\Phi_{1 \circ 1}$ satisfying the following conditions:

(a) $\Phi_1$ is a set-theoretic formula $\Phi_1(\mathfrak{E}, \mathfrak{E}', \mathfrak{F})$ involving two collections of species-data $\mathfrak{E}$, $\mathfrak{E}'$ for $\mathfrak{S}_0$ [i.e., the conditions $\Phi_0(\mathfrak{E})$, $\Phi_0(\mathfrak{E}')$ hold] and an ordered collection $\mathfrak{F} = (\mathfrak{F}_1, \ldots, \mathfrak{F}_{n_1})$ of [“indeterminate”] sets $\mathfrak{F}_1, \ldots, \mathfrak{F}_{n_1}$, for some integer $n_1 \geq 1$; in this situation, we shall refer to $(\mathfrak{E}, \mathfrak{E}', \mathfrak{F})$ as a collection of species-data for $\mathfrak{S}_1$ and write $\mathfrak{F} : \mathfrak{E} \to \mathfrak{E}'$. If, in some ZFC-model, $E, E' \in \mathfrak{S}_0$, and $F$ is a specific ordered collection of $n_1$ sets that satisfies the condition $\Phi_1(E, E', F)$, then we shall refer to the data $(E, E', F)$ as a 1-specimen of $\mathfrak{S}_1$ and write $(E, E', F) \in \mathfrak{S}_1$; alternatively, we shall denote a 1-specimen $(E, E', F)$ via the notation $F : E \to E'$ and refer to $E$ (respectively, $E'$) as the domain (respectively, codomain) of $F : E \to E'$.

(b) $\Phi_{1 \circ 1}$ is a set-theoretic formula $\Phi_{1 \circ 1}(\mathfrak{E}, \mathfrak{E}', \mathfrak{E}'', \mathfrak{F}, \mathfrak{F}', \mathfrak{F}'')$ involving three collections of species-data $\mathfrak{F} : \mathfrak{E} \to \mathfrak{E}'$, $\mathfrak{F}' : \mathfrak{E}' \to \mathfrak{E}''$, $\mathfrak{F}'' : \mathfrak{E} \to \mathfrak{E}''$ for $\mathfrak{S}_1$ [i.e., the conditions $\Phi_0(\mathfrak{E})$; $\Phi_0(\mathfrak{E}')$; $\Phi_0(\mathfrak{E}'')$; $\Phi_1(\mathfrak{E}, \mathfrak{E}', \mathfrak{F})$; $\Phi_1(\mathfrak{E}', \mathfrak{E}'', \mathfrak{F}')$; $\Phi_1(\mathfrak{E}, \mathfrak{E}'', \mathfrak{F}'')$ hold]; in this situation, we shall refer to $\mathfrak{F}''$ as a composite of $\mathfrak{F}$ with $\mathfrak{F}'$ and write $\mathfrak{F}'' = \mathfrak{F}' \circ \mathfrak{F}$ [which is, a priori, an abuse of notation, since there may exist many composites of $\mathfrak{F}$ with $\mathfrak{F}'$; cf. (c) below]; we shall use similar terminology and notation for 1-specimens in specific ZFC-models.

(c) Given a pair of 1-specimens $F : E \to E'$, $F' : E' \to E''$ of $\mathfrak{S}_1$ in some ZFC-model, there exists a unique composite $F'' : E \to E''$ of $F$ with $F'$ in the given ZFC-model.

(d) Composition of 1-specimens $F : E \to E'$, $F' : E' \to E''$, $F'' : E'' \to E'''$ of $\mathfrak{S}_1$ in a ZFC-model is associative.

(e) For any 0-specimen $E$ of $\mathfrak{S}_0$ in a ZFC-model, there exists a [necessarily unique] 1-specimen $F : E \to E$ of $\mathfrak{S}_1$ [in the given ZFC-model], which we shall refer to as the identity 1-specimen $\mathrm{id}_E$ of $E$, such that for any 1-specimens $F' : E' \to E$, $F'' : E \to E''$ of $\mathfrak{S}_1$ [in the given ZFC-model] we have $F \circ F' = F'$, $F'' \circ F = F''$.

If, moreover, it holds, in any ZFC-model, that for any two 0-specimens $E$, $E'$ of $\mathfrak{S}_0$, the 1-specimens $F : E \to E'$ of $\mathfrak{S}_1$ [i.e., the 1-specimens of $\mathfrak{S}_1$ with domain $E$ and codomain $E'$] form a set, then we shall refer to $\mathfrak{S}_1$ as 1-small.

(iii) A species $\mathfrak{S}$ is defined to be a pair consisting of a 0-species $\mathfrak{S}_0$ and a 1-species $\mathfrak{S}_1$ acting on $\mathfrak{S}_0$. Fix a species $\mathfrak{S} = (\mathfrak{S}_0, \mathfrak{S}_1)$. Let $i \in \{0, 1\}$. Then we shall refer to an $i$-specimen of $\mathfrak{S}_i$ as an $i$-specimen of $\mathfrak{S}$.
We shall refer to a 0-specimen (respectively, 1-specimen) of $\mathfrak{S}$ as a species-object (respectively, a species-morphism) of $\mathfrak{S}$. We shall say that $\mathfrak{S}$ is $i$-small if $\mathfrak{S}_i$ is $i$-small. We shall refer to a species-morphism $F : E \to E'$ as a species-isomorphism if there exists a species-morphism $F' : E' \to E$ such that the composites $F \circ F'$, $F' \circ F$ are identity species-morphisms; in this situation, we shall say that $E$, $E'$ are species-isomorphic. [Thus, one verifies immediately that composites of species-isomorphisms are species-isomorphisms.] We shall refer to a species-isomorphism whose domain and codomain are equal as a species-automorphism. We shall refer to as model-free [cf. Remark 3.1.1 below] an $i$-specimen of $\mathfrak{S}$ equipped with a description via a set-theoretic formula that is “independent of the ZFC-model in which it is given” in the sense that for any pair of universes $V_1$, $V_2$ of some ZFC-model such that $V_1 \in V_2$, the set-theoretic formula determines the same $i$-specimen of $\mathfrak{S}$, whether interpreted relative to the ZFC-model determined by $V_1$ or the ZFC-model determined by $V_2$.

(iv) We shall refer to as the category determined by $\mathfrak{S}$ in a ZFC-model the category whose objects are the species-objects of $\mathfrak{S}$ in the given ZFC-model and whose arrows are the species-morphisms of $\mathfrak{S}$ in the given ZFC-model. [One verifies immediately that this description does indeed determine a category.]

Remark 3.1.1. We observe that any of the familiar descriptions of $\mathbb{N}$ [cf., e.g., [Drk], Chapter 2, Definitions 2.3, 2.9], $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{Q}_p$, or $\mathbb{R}$, for instance, yield species [all of whose species-morphisms are identity species-morphisms] each of which has a unique species-object in any given ZFC-model. Such species are not to be confused with such species as the species of “monoids isomorphic to $\mathbb{N}$ and monoid isomorphisms”, which admits many species-objects [all of which are species-isomorphic] in any ZFC-model.
On the other hand, the set-theoretic formula used, for instance, to define the former “species $\mathbb{N}$” may be applied to define a “model-free species-object $\mathbb{N}$” of the latter “species of monoids isomorphic to $\mathbb{N}$”.

Remark 3.1.2.

(i) It is important to remember when working with species that

the essence of a species lies not in the specific sets that occur as species-objects or species-morphisms of the species in various ZFC-models, but rather in the collection of rules, i.e., set-theoretic formulas, that govern the construction of such sets in an unspecified, “indeterminate” ZFC-model.

Put another way, the emphasis in the theory of species lies in the programs [i.e., “software”] that yield the desired output data, not on the output data itself. From this point of view, one way to describe the various set-theoretic formulas that constitute a species is as a “deterministic algorithm” [a term suggested to the author by Minhyong Kim] for constructing the sets to be considered.

(ii) One interesting point of view that arose in discussions between the author and F. Kato is the following. The relationship between the classical approach to discussing mathematics relative to a fixed model of set theory (an approach in which specific sets play a central role) and the “species-theoretic” approach considered here (in which the rules, given by set-theoretic formulas for constructing the sets of interest [i.e., not specific sets themselves!], play a central role) may be regarded as analogous to the relationship between classical approaches to algebraic varieties (in which specific sets of solutions of polynomial equations in an algebraically closed field play a central role) and scheme theory (in which the functor determined by a scheme, i.e., the polynomial equations, or “rules”, that determine solutions, as opposed to specific sets of solutions themselves, plays a central role).
That is to say, in summary:

[fixed model of set theory approach : species-theoretic approach] ←→ [varieties : schemes]

A similar analogy, i.e., of the form

[fixed model of set theory approach : species-theoretic approach] ←→ [groups of specific matrices : abstract groups]

may be made to the notion of an “abstract group”, as opposed to a “group of specific matrices”. That is to say, just as a “group of specific matrices” may be thought of as a specific representation of an “abstract group”, the category of objects determined by a species in a specific ZFC-model may be thought of as a specific representation of an “abstract species”.

(iii) If, in the context of the discussion of (i), (ii), one tries to form a sort of quotient, in which “programs” that yield the same sets as “output data” are identified, then one must contend with the resulting indeterminacy, i.e., working with programs is only well-defined up to internal modifications of the programs in question that do not affect the final output. This leads to somewhat intractable problems concerning the internal structure of such programs, a topic that lies well beyond the scope of the present work.

Remark 3.1.3.

(i) Typically, in the discussion to follow, we shall not write out explicitly the various set-theoretic formulas involved in the definition of a species. Rather, it is to be understood that the set-theoretic formulas to be used are those arising from the conventional descriptions of the mathematical objects involved. When applying such conventional descriptions, however, it is important to check that they are well-defined and do not depend upon the use of arbitrary choices that are not describable via well-defined set-theoretic formulas.
(ii) The fact that the data involved in a species is given by abstract set-theoretic formulas imparts a certain canonicality to the mathematical notion constituted by the species, a canonicality that is not shared, for instance, by mathematical objects whose construction depends on an invocation of the axiom of choice in some particular ZFC-model [cf. the discussion of (i) above]. Moreover, by furnishing a stock of such “canonical notions”, the theory of species allows one, in effect, to compute the extent of deviation of various “non-canonical objects” [i.e., whose construction depends upon the invocation of the axiom of choice!] from a sort of “canonical norm”.

Remark 3.1.4. Note that because the data involved in a species is given by abstract set-theoretic formulas, the mathematical notion constituted by the species is immune to, i.e., unaffected by, extensions of the universe in which one works, i.e., such as the ascending chain

$V_0 \in V_1 \in V_2 \in V_3 \in \ldots \in V_n \in \ldots \in V$

that appears in the discussion preceding Definition 3.1. This is the sense in which we apply the term “inter-universal”. That is to say, “inter-universal geometry” allows one to relate the “geometries” that occur in distinct universes.

Remark 3.1.5. Similar remarks to the remarks made in Remarks 3.1.2, 3.1.3, and 3.1.4 concerning the significance of working with set-theoretic formulas may be made with regard to the notions of mutations, morphisms of mutations, mutation-histories, observables, and cores to be introduced in Definition 3.3 below.

One fundamental example of a species is the following.

Example 3.2. Categories. The notions of a [small] category and an isomorphism class of [covariant] functors between two given [small] categories yield an example of a species.
That is to say, at a set-theoretic level, one may think of a [small] category as, for instance, a set of arrows, together with a set of composition relations, that satisfies certain properties; one may think of a [covariant] functor between [small] categories as the set given by the graph of the map on arrows determined by the functor [which satisfies certain properties]; one may think of an isomorphism class of functors as a collection of such graphs, i.e., the graphs determined by the functors in the isomorphism class, which satisfies certain properties. Then one has “dictionaries”

0-species ←→ the notion of a category
1-species ←→ the notion of an isomorphism class of functors

at the level of notions and

a 0-specimen ←→ a particular [small] category
a 1-specimen ←→ a particular isomorphism class of functors

at the level of specific mathematical objects in a specific ZFC-model. Moreover, one verifies easily that species-isomorphisms between 0-specimens correspond to isomorphism classes of equivalences of categories in the usual sense.

Remark 3.2.1. Note that in the case of Example 3.2, one could also define a notion of “2-species”, “2-specimens”, etc., via the notion of an “isomorphism of functors”, and then take the 1-species under consideration to be the notion of a functor [i.e., not an isomorphism class of functors]. Indeed, more generally, one could define a notion of “n-species” for arbitrary integers $n \geq 1$. Since, however, this approach would only serve to add an unnecessary level of complexity to the theory, we choose here to take the approach of working with “functors considered up to isomorphism”.

Definition 3.3. Let $\mathfrak{S} = (\mathfrak{S}_0, \mathfrak{S}_1)$; $\underline{\mathfrak{S}} = (\underline{\mathfrak{S}}_0, \underline{\mathfrak{S}}_1)$ be species.
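(i) A mutation $\mathfrak{M} : \mathfrak{S} \rightsquigarrow \underline{\mathfrak{S}}$ is defined to be a collection of set-theoretic formulas $\Psi_0$, $\Psi_1$ satisfying the following properties:

(a) $\Psi_0$ is a set-theoretic formula $\Psi_0(\mathfrak{E}, \underline{\mathfrak{E}})$ involving a collection of species-data $\mathfrak{E}$ for $\mathfrak{S}_0$ and a collection of species-data $\underline{\mathfrak{E}}$ for $\underline{\mathfrak{S}}_0$; in this situation, we shall write $\mathfrak{M}(\mathfrak{E})$ for $\underline{\mathfrak{E}}$. Moreover, if, in some ZFC-model, $E \in \mathfrak{S}_0$, then we require that there exist a unique $\underline{E} \in \underline{\mathfrak{S}}_0$ such that $\Psi_0(E, \underline{E})$ holds; in this situation, we shall write $\mathfrak{M}(E)$ for $\underline{E}$.

(b) $\Psi_1$ is a set-theoretic formula $\Psi_1(\mathfrak{E}, \mathfrak{E}', \mathfrak{F}, \underline{\mathfrak{F}})$ involving a collection of species-data $\mathfrak{F} : \mathfrak{E} \to \mathfrak{E}'$ for $\mathfrak{S}_1$ and a collection of species-data $\underline{\mathfrak{F}} : \underline{\mathfrak{E}} \to \underline{\mathfrak{E}}'$ for $\underline{\mathfrak{S}}_1$, where $\underline{\mathfrak{E}} = \mathfrak{M}(\mathfrak{E})$, $\underline{\mathfrak{E}}' = \mathfrak{M}(\mathfrak{E}')$; in this situation, we shall write $\mathfrak{M}(\mathfrak{F})$ for $\underline{\mathfrak{F}}$. Moreover, if, in some ZFC-model, $(F : E \to E') \in \mathfrak{S}_1$, then we require that there exist a unique $(\underline{F} : \underline{E} \to \underline{E}') \in \underline{\mathfrak{S}}_1$ such that $\Psi_1(E, E', F, \underline{F})$ holds; in this situation, we shall write $\mathfrak{M}(F)$ for $\underline{F}$. Finally, we require that the assignment $F \mapsto \mathfrak{M}(F)$ be compatible with composites and map identity species-morphisms of $\mathfrak{S}$ to identity species-morphisms of $\underline{\mathfrak{S}}$. In particular, if one fixes a ZFC-model, then $\mathfrak{M}$ determines a functor from the category determined by $\mathfrak{S}$ in the given ZFC-model to the category determined by $\underline{\mathfrak{S}}$ in the given ZFC-model.

There are evident notions of “composition of mutations” and “identity mutations”.

(ii) Let $\mathfrak{M}, \mathfrak{M}' : \mathfrak{S} \rightsquigarrow \underline{\mathfrak{S}}$ be mutations. Then a morphism of mutations $\mathfrak{Z} : \mathfrak{M} \to \mathfrak{M}'$ is defined to be a set-theoretic formula $\Xi$ satisfying the following properties:

(a) $\Xi$ is a set-theoretic formula $\Xi(\mathfrak{E}, \underline{\mathfrak{F}})$ involving a collection of species-data $\mathfrak{E}$ for $\mathfrak{S}_0$ and a collection of species-data $\underline{\mathfrak{F}} : \mathfrak{M}(\mathfrak{E}) \to \mathfrak{M}'(\mathfrak{E})$ for $\underline{\mathfrak{S}}_1$; in this situation, we shall write $\mathfrak{Z}(\mathfrak{E})$ for $\underline{\mathfrak{F}}$. Moreover, if, in some ZFC-model, $E \in \mathfrak{S}_0$, then we require that there exist a unique $\underline{F} \in \underline{\mathfrak{S}}_1$ such that $\Xi(E, \underline{F})$ holds; in this situation, we shall write $\mathfrak{Z}(E)$ for $\underline{F}$.

(b) Suppose, in some ZFC-model, that $F : E_1 \to E_2$ is a species-morphism of $\mathfrak{S}$. Then one has an equality of composite species-morphisms $\mathfrak{M}'(F) \circ \mathfrak{Z}(E_1) = \mathfrak{Z}(E_2) \circ \mathfrak{M}(F) : \mathfrak{M}(E_1) \to \mathfrak{M}'(E_2)$. In particular, if one fixes a ZFC-model, then a morphism of mutations $\mathfrak{M} \to \mathfrak{M}'$ determines a natural transformation between the functors determined by $\mathfrak{M}$, $\mathfrak{M}'$ in the ZFC-model; cf. (i).

There are evident notions of “composition of morphisms of mutations” and “identity morphisms of mutations”. If it holds that for every species-object $E$ of $\mathfrak{S}$, $\mathfrak{Z}(E)$ is a species-isomorphism, then we shall refer to $\mathfrak{Z}$ as an isomorphism of mutations. In particular, one verifies immediately that $\mathfrak{Z}$ is an isomorphism of mutations if and only if there exists a morphism of mutations $\mathfrak{Z}' : \mathfrak{M}' \to \mathfrak{M}$ such that the composite morphisms of mutations $\mathfrak{Z}' \circ \mathfrak{Z} : \mathfrak{M} \to \mathfrak{M}$, $\mathfrak{Z} \circ \mathfrak{Z}' : \mathfrak{M}' \to \mathfrak{M}'$ are the respective identity morphisms of the mutations $\mathfrak{M}$, $\mathfrak{M}'$.

(iii) Let $\mathfrak{M} : \mathfrak{S} \rightsquigarrow \underline{\mathfrak{S}}$ be a mutation. Then we shall say that $\mathfrak{M}$ is a mutation-equivalence if there exists a mutation $\mathfrak{M}' : \underline{\mathfrak{S}} \rightsquigarrow \mathfrak{S}$, together with isomorphisms of mutations between the composites $\mathfrak{M} \circ \mathfrak{M}'$, $\mathfrak{M}' \circ \mathfrak{M}$ and the respective identity mutations. In this situation, we shall say that $\mathfrak{M}$, $\mathfrak{M}'$ are mutation-quasi-inverses to one another. Finally, we observe that, if we suppose further that $\mathfrak{S}$, $\underline{\mathfrak{S}}$ are 1-small, then for any two given species-objects in the domain species of a mutation-equivalence, the mutation-equivalence induces a bijection between the set of species-morphisms (respectively, species-isomorphisms) between the two given species-objects [of the domain species] and the set of species-morphisms (respectively, species-isomorphisms) between the two species-objects [of the codomain species] obtained by applying the mutation-equivalence to the two given species-objects.

(iv) Let $\vec{\Gamma}$ be an oriented graph, i.e., a graph $\Gamma$, which we shall refer to as the underlying graph of $\vec{\Gamma}$, equipped with the additional data of a total ordering, for each edge $e$ of $\Gamma$, on the set [of cardinality 2] of branches of $e$ [cf., e.g., [AbsTopIII], §0].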
Then we define a mutation-history $\mathfrak{H} = (\vec{\Gamma}, \mathfrak{S}^{*}, \mathfrak{M}^{*})$ [indexed by $\vec{\Gamma}$] to be a collection of data as follows:

(a) for each vertex $v$ of $\vec{\Gamma}$, a species $\mathfrak{S}^{v}$;

(b) for each edge $e$ of $\vec{\Gamma}$, running from a vertex $v_1$ to a vertex $v_2$, a mutation $\mathfrak{M}^{e} : \mathfrak{S}^{v_1} \rightsquigarrow \mathfrak{S}^{v_2}$.

In this situation, we shall refer to the vertices, edges, and branches of $\vec{\Gamma}$ as vertices, edges, and branches of $\mathfrak{H}$. Thus, the notion of a “mutation-history” may be thought of as a species-theoretic version of the notion of a “diagram of categories” given in [AbsTopIII], Definition 3.5, (i).

(v) Let $\mathfrak{H} = (\vec{\Gamma}, \mathfrak{S}^{*}, \mathfrak{M}^{*})$ be a mutation-history; $\mathfrak{S}$ a species. For simplicity, we assume that the underlying graph of $\vec{\Gamma}$ is simply connected. Then a(n) [$\mathfrak{S}$-valued] covariant (respectively, contravariant) observable $\mathfrak{V}$ of the mutation-history $\mathfrak{H}$ is defined to be a collection of data as follows:

(a) for each vertex $v$ of $\vec{\Gamma}$, a mutation $\mathfrak{V}^{v} : \mathfrak{S}^{v} \rightsquigarrow \mathfrak{S}$, which we shall refer to as the observation mutation at $v$;

(b) for each edge $e$ of $\vec{\Gamma}$, running from a vertex $v_1$ to a vertex $v_2$, a morphism of mutations $\mathfrak{V}^{e} : \mathfrak{V}^{v_1} \to \mathfrak{V}^{v_2} \circ \mathfrak{M}^{e}$ (respectively, $\mathfrak{V}^{e} : \mathfrak{V}^{v_2} \circ \mathfrak{M}^{e} \to \mathfrak{V}^{v_1}$).

If $\mathfrak{V}$ is a covariant observable such that all of the morphisms of mutations “$\mathfrak{V}^{e}$” are isomorphisms of mutations, then we shall refer to the covariant observable $\mathfrak{V}$ as a core. Thus, one may think of a core $\mathfrak{C}$ of a mutation-history as lying “under” the entire mutation-history in a “uniform fashion”. Also, we shall refer to the “property [of an observable] of being a core” as the “coricity” of the observable. Finally, we note that the notions of an “observable” and a “core” given here may be thought of as simplified, species-theoretic versions of the notions of “observable” and “core” given in [AbsTopIII], Definition 3.5, (iii).

Remark 3.3.1.

(i) One well-known consequence of the axiom of foundation of axiomatic set theory is the assertion that “$\in$-loops” $a \in b \in c \in \ldots \in a$ can never occur in the set theory in which one works.
On the other hand, there are many situations in mathematics in which one wishes to somehow “identify” mathematical objects that arise at higher levels of the $\in$-structure of the set theory under consideration with mathematical objects that arise at lower levels of this $\in$-structure. In some sense, the notions of a “set” and of a “bijection of sets” allow one to achieve such “identifications”. That is to say, the mathematical objects at both higher and lower levels of the $\in$-structure constitute examples of the same mathematical notion of a “set”, so that one may consider “bijections of sets” between those sets without violating the axiom of foundation. In some sense, the notion of a species may be thought of as a natural extension of this observation. That is to say, the notion of a “species” allows one to consider, for instance, species-isomorphisms between species-objects that occur at different levels of the $\in$-structure of the set theory under consideration [i.e., roughly speaking, to “simulate $\in$-loops”] without violating the axiom of foundation. Moreover, typically the sorts of species-objects at different levels of the $\in$-structure that one wishes to somehow have “identified” with one another occur as the result of executing the mutations that arise in some sort of mutation-history

$$\ldots \rightsquigarrow \mathfrak{S} \rightsquigarrow \underline{\mathfrak{S}} \rightsquigarrow \underline{\underline{\mathfrak{S}}} \rightsquigarrow \ldots \rightsquigarrow \mathfrak{S} \rightsquigarrow \ldots$$

[where $\mathfrak{S} = (\mathfrak{S}_0, \mathfrak{S}_1)$; $\underline{\mathfrak{S}} = (\underline{\mathfrak{S}}_0, \underline{\mathfrak{S}}_1)$; $\underline{\underline{\mathfrak{S}}} = (\underline{\underline{\mathfrak{S}}}_0, \underline{\underline{\mathfrak{S}}}_1)$ are species], e.g., the “output species-objects” of the “$\mathfrak{S}$” on the right that arise from applying various mutations to the “input species-objects” of the “$\mathfrak{S}$” on the left.

(ii) In the context of constructing “loops” in a mutation-history as in the final display of (i), we observe that the simpler the structure of the species involved, the easier it is to construct “loops”. It is for this reason that species such as the species determined by the notion of a category [cf.
Example 3.2] are easier to work with, from the point of view of constructing “loops”, than more complicated species such as the species determined by the notion of a scheme. This is one of the principal motivations for the “geometry of categories” [of which “absolute anabelian geometry” is the special case that arises when the categories involved are Galois categories], i.e., for the theory of representing scheme-theoretic geometries via categories [cf., e.g., the Introductions of [MnLg], [SemiAnbd], [Cusp], [FrdI]]. At a more concrete level, the utility of working with categories to reconstruct objects that occurred at earlier stages of some sort of “series of constructions” [cf. the mutation-history of the final display of (i)!] may be seen in the “reconstruction of the underlying scheme” in various situations throughout [MnLg] by applying the natural equivalence of categories of the final display of [MnLg], Definition 1.1, (iv), from a certain category constructed from a log scheme, as well as in the theory of “slim exponentiation” discussed in the Appendix to [FrdI].

(iii) Again in the context of mutation-histories such as the one given in the final display of (i), although one may, on certain occasions, wish to apply various mutations that fundamentally alter the structure of the mathematical objects involved and hence give rise to “output species-objects” of the “$\mathfrak{S}$” on the right that are related in a highly nontrivial fashion to the “input species-objects” of the “$\mathfrak{S}$” on the left, it is also of interest to consider

“portions” of the various mathematical objects that occur that are left unaltered by the various mutations that one applies.

This is precisely the reason for the introduction of the notion of a core of a mutation-history.
One important consequence of the construction of various cores associated to a mutation-history is that often one may apply these cores to describe, by means of non-coric observables, the portions of the various mathematical objects that are altered by the various mutations that one applies in terms of the unaltered portions, i.e., cores. Indeed, this point of view plays a central role in the theory of the present series of papers [cf. the discussion of Remark 3.6.1, (ii), below].

Remark 3.3.2. One somewhat naive point of view that constituted one of the original motivations for the author in the development of the theory of the present series of papers is the following. In the classical theory of schemes, when considering local systems on a scheme, there is no reason to restrict oneself to considering local systems valued in, say, modules over a finite ring. If, moreover, there is no reason to make such a restriction, then one is naturally led to consider, for instance, local systems of schemes [cf., e.g., the theory of the “Galois mantle” in [pTeich]], or, indeed, local systems of more general collections of mathematical objects. One may then ask what happens if one tries to consider local systems on the schemes that occur as fibers of a local system of schemes. [More concretely, if $X$ is, for instance, a connected scheme, then one may consider local systems $\mathcal{X}$ over $X$ whose fibers are isomorphic to $X$; then one may repeat this process, by considering such local systems over each fiber of the local system $\mathcal{X}$ on $X$, etc.] In this way, one is eventually led to the consideration of “systems of nested local systems”, i.e., a local system over a local system over a local system, etc. It is precisely this point of view that underlies the notion of “successive iteration of a given mutation-history”, relative to the terminology formulated in the present §3.
If, moreover, one thinks of such “successive iterates of a given mutation-history” as being a sort of abstraction of the naive idea of a “system of nested local systems”, then the notion of a core may be thought of as a sort of mathematical object that is invariant with respect to the application of the operations that gave rise to the “system of nested local systems”.

Example 3.4. Topological Spaces and Fundamental Groups.

(i) One verifies easily that the notions of a topological space and a continuous map between topological spaces determine an example of a species $\mathfrak{S}^{\mathrm{top}}$. In a similar vein, the notions of a universal covering $\widetilde{X} \to X$ of a pathwise connected topological space $X$ and a continuous map between such universal coverings $\widetilde{X} \to X$, $\widetilde{Y} \to Y$ [i.e., a pair of compatible continuous maps $\widetilde{X} \to \widetilde{Y}$, $X \to Y$], considered up to composition with a deck transformation of the universal covering $\widetilde{Y} \to Y$, determine an example of a species $\mathfrak{S}^{\mathrm{u\text{-}top}}$. We leave to the reader the routine task of writing out the various set-theoretic formulas that define the species structures of $\mathfrak{S}^{\mathrm{top}}$, $\mathfrak{S}^{\mathrm{u\text{-}top}}$. Here, we note that at a set-theoretic level, the species-morphisms of $\mathfrak{S}^{\mathrm{u\text{-}top}}$ are collections of continuous maps [between two given universal coverings], any two of which differ from one another by composition with a deck transformation.

(ii) One verifies easily that the notions of a group and an outer homomorphism between groups [i.e., a homomorphism considered up to composition with an inner automorphism of the codomain group] determine an example of a species $\mathfrak{S}^{\mathrm{gp}}$. We leave to the reader the routine task of writing out the various set-theoretic formulas that define the species structure of $\mathfrak{S}^{\mathrm{gp}}$. Here, we note that at a set-theoretic level, the species-morphisms of $\mathfrak{S}^{\mathrm{gp}}$ are collections of homomorphisms [between two given groups], any two of which differ from one another by composition with an inner automorphism.
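As a hedged illustration of the “routine task” mentioned in (ii), the following sketch records one possible way to write the species-morphisms of the species of groups and outer homomorphisms as orbits of homomorphisms, and checks that composition of such orbits is well defined. The notation $\mathrm{Hom}^{\mathrm{out}}$, $\mathrm{Inn}$, and $\iota_h$ is introduced here for illustration only and does not appear in the original text.

```latex
% Illustrative notation (an assumption, not from the source):
% \iota_h denotes the inner automorphism x \mapsto h x h^{-1}.
\[
  \mathrm{Hom}^{\mathrm{out}}(G,H) \;=\; \mathrm{Hom}(G,H)/\mathrm{Inn}(H),
  \qquad
  [\phi] \;=\; \{\, \iota_h \circ \phi \;:\; h \in H \,\}.
\]
% Composition of classes is defined on representatives
% \phi : G \to H and \psi : H \to K:
\[
  [\psi] \circ [\phi] \;=\; [\psi \circ \phi].
\]
% Well-definedness: for h \in H and k \in K one has
% \psi \circ \iota_h = \iota_{\psi(h)} \circ \psi, hence
\[
  (\iota_k \circ \psi) \circ (\iota_h \circ \phi)
  \;=\; \iota_k \circ \iota_{\psi(h)} \circ \psi \circ \phi
  \;=\; \iota_{k\,\psi(h)} \circ (\psi \circ \phi).
\]
```

Thus any two choices of representatives yield composites that differ by an inner automorphism of the codomain group, which is the compatibility with composites that a species structure requires.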
(iii) Now one verifies easily that the assignment

$$(\widetilde{X} \to X) \mapsto \mathrm{Aut}(\widetilde{X}/X)$$

[where $(\widetilde{X} \to X)$ is a species-object of $\mathfrak{S}^{\mathrm{u\text{-}top}}$, and $\mathrm{Aut}(\widetilde{X}/X)$ denotes the group of deck transformations of the universal covering $\widetilde{X} \to X$] determines a mutation $\mathfrak{S}^{\mathrm{u\text{-}top}} \rightsquigarrow \mathfrak{S}^{\mathrm{gp}}$. That is to say, the “fundamental group” may be thought of as a sort of mutation.

Example 3.5. Absolute Anabelian Geometry.

(i) Let $\mathbb{S}$ be a class of connected normal schemes that is closed under isomorphism [of schemes]. Suppose that there exists a set $E_{\mathbb{S}}$ of schemes describable by a set-theoretic formula with the property that every scheme of $\mathbb{S}$ is isomorphic to some scheme belonging to $E_{\mathbb{S}}$. Then just as in the case of universal coverings of topological spaces discussed in Example 3.4, (i), one verifies easily, by applying the set-theoretic formula describing $E_{\mathbb{S}}$, that the universal pro-finite étale coverings $\widetilde{X} \to X$ of schemes $X$ belonging to $\mathbb{S}$ and isomorphisms of such coverings considered up to composition with a deck transformation give rise to a species $\mathfrak{S}^{\mathbb{S}}$.

(ii) Let $\mathbb{G}$ be a class of topological groups that is closed under isomorphism [of topological groups]. Suppose that there exists a set $E_{\mathbb{G}}$ of topological groups describable by a set-theoretic formula with the property that every topological group of $\mathbb{G}$ is isomorphic to some topological group belonging to $E_{\mathbb{G}}$. Then just as in the case of abstract groups discussed in Example 3.4, (ii), one verifies easily, by applying the set-theoretic formula describing $E_{\mathbb{G}}$, that topological groups belonging to $\mathbb{G}$ and [bi-continuous] outer isomorphisms between such topological groups give rise to a species $\mathfrak{S}^{\mathbb{G}}$.

(iii) Let $\mathbb{S}$ be as in (i). Then for an appropriate choice of $\mathbb{G}$, by associating to a universal pro-finite étale covering the resulting group of deck transformations, one obtains a mutation $\Pi : \mathfrak{S}^{\mathbb{S}} \rightsquigarrow \mathfrak{S}^{\mathbb{G}}$ [cf. Example 3.4, (iii)].
Then one way to define the notion that the schemes belonging to the class $\mathbb{S}$ are “[absolute] anabelian” is to require the specification of a mutation

$$\mathfrak{A} : \mathfrak{S}^{\mathbb{G}} \rightsquigarrow \mathfrak{S}^{\mathbb{S}}$$

which forms a mutation-quasi-inverse to $\Pi$. Here, we note that the existence of the bijections [i.e., “fully faithfulness”] discussed in Definition 3.3, (iii), is, in essence, the condition that is usually taken as the definition of “anabelian”. By contrast, the species-theoretic approach of the present discussion may be thought of as an explicit mathematical formulation of the algorithmic approach to [absolute] anabelian geometry discussed in the Introduction to [AbsTopI].

(iv) The framework of [absolute] anabelian geometry [cf., e.g., the framework discussed above in (iii)] gives a good example of the importance of specifying precisely what species one is working with in a given “series of constructions” [cf., e.g., the mutation-history of the final display of Remark 3.3.1, (i)]. That is to say, there is a quite substantial difference between working with a profinite group in its sole capacity as a profinite group and working with the same profinite group [which may happen to arise as the étale fundamental group of some scheme!] regarded as being equipped with various data that arise from the construction of the profinite group as the étale fundamental group of some scheme. It is precisely this sort of issue that constituted one of the original motivations for the author in the development of the theory of species presented here.

Example 3.6. The Étale Site and Frobenius.

(i) Let $p$ be a prime number. If $S$ is a reduced scheme over $\mathbb{F}_p$, then denote by $S^{(p)}$ the scheme with the same topological space as $S$, but whose structure sheaf is given by the subsheaf

$$\mathcal{O}_{S^{(p)}} \stackrel{\mathrm{def}}{=} (\mathcal{O}_S)^p \subseteq \mathcal{O}_S$$

of $p$-th powers of sections of $S$. Thus, the natural inclusion $\mathcal{O}_{S^{(p)}} \hookrightarrow \mathcal{O}_S$ induces a morphism $\Phi_S : S \to S^{(p)}$.
Moreover, “raising to the $p$-th power” determines a natural isomorphism $\alpha_S : S^{(p)} \xrightarrow{\sim} S$ such that the resulting composite $\alpha_S \circ \Phi_S : S \to S$ is the Frobenius morphism of $S$. Write $\mathfrak{S}^{p\text{-sch}}$ for the species of reduced quasi-compact schemes over $\mathbb{F}_p$ and quasi-compact morphisms of schemes. Then consider the [small] category $S_{\text{ét}}$, i.e., “the small étale site of $S$”, defined as follows:

An object of $S_{\text{ét}}$ is a(n) [necessarily quasi-affine, by Zariski’s Main Theorem!] étale morphism of finite presentation $T \to S$ equipped with a finite open cover $\{U_i\}_{i \in I}$ of $S$, together with factorizations $T|_{U_i} \subseteq \mathbb{A}^{N_i}_{U_i} \to U_i$ for each $i \in I$. Here, $I$ is a finite subset of the set of open subschemes of $S$; $\mathbb{A}^{N_i}_{U_i}$ denotes a standard copy of affine $N_i$-space over $U_i$, for some integer $N_i \geq 1$; the “$\subseteq$” exhibits $T|_{U_i}$ as a finitely presented subscheme of $\mathbb{A}^{N_i}_{U_i}$; we observe that any étale morphism of finite presentation $T \to S$ necessarily admits such auxiliary data parametrized by some index set $I$. A morphism of $S_{\text{ét}}$ from an object $T_1 \to S$ to an object $T_2 \to S$ [each of which is equipped with auxiliary data] is a(n) [necessarily étale of finite presentation] $S$-morphism $T_1 \to T_2$.

In particular, one may construct an assignment $S \mapsto S_{\text{ét}}$ that maps a species-object $S$ of $\mathfrak{S}^{p\text{-sch}}$ to the [small] category $S_{\text{ét}}$ in such a way that the assignment $S \mapsto S_{\text{ét}}$ is contravariantly functorial with respect to species-morphisms $S_1 \to S_2$ of $\mathfrak{S}^{p\text{-sch}}$, and, moreover, may be described via set-theoretic formulas. Thus, such an assignment determines an “étale site mutation”

$$\mathfrak{M}^{\text{ét}} : \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{\mathrm{cat}}$$

where we write $\mathfrak{S}^{\mathrm{cat}}$ for the species of categories and isomorphism classes of contravariant functors [i.e., a slightly modified form of the species considered in Example 3.2]. Another natural assignment in the present context is the assignment $S \mapsto S^{\mathrm{pf}}$ which maps $S$ to its perfection $S^{\mathrm{pf}}$, i.e., the scheme determined by taking the inverse limit of the inverse system

$$\ldots \to S \to S \to S$$

obtained by considering iterates of the Frobenius morphism of $S$. Thus, by considering the final copy of “$S$” in this inverse system, one obtains a natural morphism $\beta_S : S^{\mathrm{pf}} \to S$. Finally, one obtains a “perfection mutation”

$$\mathfrak{M}^{\mathrm{pf}} : \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{p\text{-sch}}$$

by considering the set-theoretic formulas underlying the assignment $S \mapsto S^{\mathrm{pf}}$.

(ii) Write $\mathfrak{F}^{p\text{-sch}} : \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{p\text{-sch}}$ for the “Frobenius mutation” obtained by considering the set-theoretic formulas underlying the assignment $S \mapsto S^{(p)}$. Thus, one may formulate the well-known “invariance of the étale site under Frobenius” [cf., e.g., [FK], Chapter I, Proposition 3.16] as the statement that the “étale site mutation” $\mathfrak{M}^{\text{ét}}$ exhibits $\mathfrak{S}^{\mathrm{cat}}$ as a core, i.e., an “invariant piece”, of the “Frobenius mutation-history”

$$\ldots \rightsquigarrow \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \ldots$$

determined by the “Frobenius mutation” $\mathfrak{F}^{p\text{-sch}}$. In this context, we observe that the “perfection mutation” $\mathfrak{M}^{\mathrm{pf}}$ also yields a core, i.e., another “invariant piece”, of the Frobenius mutation-history. On the other hand, the natural morphism $\Phi_S : S \to S^{(p)}$ may be interpreted as a covariant observable of this mutation-history whose observation mutations are the identity mutations $\mathrm{id}_{\mathfrak{S}^{p\text{-sch}}} : \mathfrak{S}^{p\text{-sch}} \rightsquigarrow \mathfrak{S}^{p\text{-sch}}$. Since $\Phi_S$ is not, in general, an isomorphism, it follows that this observable constitutes an example of a non-coric observable. Nevertheless, the natural morphism $\beta_S : S^{\mathrm{pf}} \to S$ may be interpreted as a morphism of mutations $\mathfrak{M}^{\mathrm{pf}} \to \mathrm{id}_{\mathfrak{S}^{p\text{-sch}}}$ that serves to relate the non-coric observable just considered to the coric observable arising from $\mathfrak{M}^{\mathrm{pf}}$.

(iii) One may also develop a version of (i), (ii) for log schemes; we leave the routine details to the interested reader. Here, we pause to mention that the theory of log schemes motivates the following “combinatorial monoid-theoretic” version of the non-coric observable on the Frobenius mutation-history of (ii). Write $\mathfrak{S}^{\mathrm{mon}}$ for the species of torsion-free abelian monoids and morphisms of monoids.
If $M$ is a species-object of $\mathfrak{S}^{\mathrm{mon}}$, then write $M^{(p)} \stackrel{\mathrm{def}}{=} p \cdot M \subseteq M$. Then the assignment $M \mapsto M^{(p)}$ determines a “monoid-Frobenius mutation”

$$\mathfrak{F}^{\mathrm{mon}} : \mathfrak{S}^{\mathrm{mon}} \rightsquigarrow \mathfrak{S}^{\mathrm{mon}}$$

and hence a “monoid-Frobenius mutation-history”

$$\ldots \rightsquigarrow \mathfrak{S}^{\mathrm{mon}} \rightsquigarrow \mathfrak{S}^{\mathrm{mon}} \rightsquigarrow \ldots$$

which is equipped with a non-coric contravariant observable determined by the natural inclusion morphism $M^{(p)} \hookrightarrow M$ and the observation mutations given by the identity mutations $\mathrm{id}_{\mathfrak{S}^{\mathrm{mon}}} : \mathfrak{S}^{\mathrm{mon}} \rightsquigarrow \mathfrak{S}^{\mathrm{mon}}$. On the other hand, the $p$-perfection $M^{\mathrm{pf}}$ of $M$, i.e., the inductive limit of the inductive system $M \hookrightarrow M \hookrightarrow M \hookrightarrow \ldots$ obtained by considering the inclusions given by multiplying by $p$, gives rise to a “monoid-$p$-perfection mutation”

$$\mathfrak{M}^{\mathrm{pf\text{-}mon}} : \mathfrak{S}^{\mathrm{mon}} \rightsquigarrow \mathfrak{S}^{\mathrm{mon}}$$

which may be interpreted as a core of the monoid-Frobenius mutation-history. Finally, the natural inclusion of monoids $M \hookrightarrow M^{\mathrm{pf}}$ may be interpreted as a morphism of mutations $\mathrm{id}_{\mathfrak{S}^{\mathrm{mon}}} \to \mathfrak{M}^{\mathrm{pf\text{-}mon}}$ that serves to relate the non-coric observable just considered to the coric observable arising from $\mathfrak{M}^{\mathrm{pf\text{-}mon}}$.

Remark 3.6.1.

(i) The various constructions of Example 3.6 may be thought of as providing, in the case of the phenomena of “invariance of the étale site under Frobenius” and “invariance of the perfection under Frobenius”, a “species-theoretic interpretation” [i.e., via consideration of “coric” versus “non-coric” observables] of the difference between “étale-type” and “Frobenius-type” structures [cf. the discussion of [FrdI], §I4]. This sort of approach via “combinatorial patterns” to expressing the difference between “étale-type” and “Frobenius-type” structures plays a central role in the theory of the present series of papers.
Indeed, the mutation-histories and cores considered in Example 3.6, (ii), (iii), may be thought of as the underlying motivating examples for the theory of both

· the vertical lines, i.e., consisting of log-links, and
· the horizontal lines, i.e., consisting of $\Theta^{\times\mu}$-/$\Theta^{\times\mu}_{\mathrm{gau}}$-/$\Theta^{\times\mu}_{\mathrm{LGP}}$-/$\Theta^{\times\mu}_{\mathfrak{lgp}}$-links,

of the log-theta-lattice [cf. [IUTchIII], Definitions 1.4, 3.8]. Finally, we recall that this approach to understanding the log-links may be seen in the introduction of the terminology of “observables” and “cores” in [AbsTopIII], Definition 3.5, (iii).

(ii) Example 3.6 also provides a good example of the important theme [cf. the discussion of Remark 3.3.1, (iii)] of describing non-coric data in terms of coric data [cf. the morphism $\beta_S : S^{\mathrm{pf}} \to S$ of Example 3.6, (ii); the natural inclusion $M \hookrightarrow M^{\mathrm{pf}}$ of Example 3.6, (iii)]. From the point of view of the vertical and horizontal lines of the log-theta-lattice [cf. the discussion of (i)], this theme may also be observed in the vertically coric log-shells that serve as a common receptacle for the various arrows of the log-Kummer correspondences of [IUTchIII], Theorem 3.11, (ii), as well as in the multiradial representations of [IUTchIII], Theorem 3.11, (i), which describe [certain aspects of] the arithmetic holomorphic structure on one vertical line of the log-theta-lattice in terms that may be understood relative to an alien arithmetic holomorphic structure on another vertical line, i.e., one separated from the first vertical line by horizontal arrows, of the log-theta-lattice [cf. [IUTchIII], Remark 3.11.1; [IUTchIII], Remark 3.12.2, (ii)].

Remark 3.6.2.

(i) In the context of the theme of “coric descriptions of non-coric data” discussed in Remark 3.6.1, (ii), it is of interest to observe the significance of the use of set-theoretic formulas [cf. the discussion of Remarks 3.1.2, 3.1.3, 3.1.4, 3.1.5] to realize such descriptions.
That is to say, descriptions in terms of arbitrary choices that depend on a particular model of set theory [cf. Remark 3.1.3] do not allow one to calculate in terms that make sense in one universe the operations performed in an alien universe! This is precisely the sort of situation that one encounters when one considers the vertical and horizontal arrows of the log-theta-lattice [cf. (ii) below], where distinct universes arise from the distinct scheme-theoretic basepoints on either side of such an arrow that correspond to distinct ring theories, i.e., ring theories that cannot be related to one another by means of a ring homomorphism [cf. the discussion of Remark 3.6.3 below]. Indeed, it was precisely the need to understand this sort of situation that led the author to develop the “inter-universal” version of Teichmüller theory exposed in the present series of papers. Finally, we observe that the algorithmic approach [i.e., as opposed to the “fully faithfulness/Grothendieck Conjecture-style approach”; cf. Example 3.5, (iii)] to reconstruction issues via set-theoretic formulas plays an essential role in this context. That is to say, although different algorithms, or software, may yield the same output data, it is only by working with specific algorithms that one may understand the delicate inter-relations that exist between various components of the structures that occur as one performs various operations [i.e., the mutations of a mutation-history]. In the case of the theory developed in the present series of papers, one central example of this phenomenon is the cyclotomic rigidity isomorphisms that underlie the theory of $\Theta^{\times\mu}_{\mathrm{LGP}}$-link compatibility discussed in [IUTchIII], Theorem 3.11, (iii), (c), (d) [cf. also [IUTchIII], Remarks 2.2.1, 2.3.2].
(ii) The algorithmic approach to reconstruction that is taken throughout the present series of papers, as well as, for instance, in [FrdI], [EtTh], and [AbsTopIII], was conceived by the author in the spirit of the species-theoretic formulation exposed in the present §3. Nevertheless, [cf. Remark 3.1.3, (i)] we shall not explicitly write out the various set-theoretic formulas involved in the various species, mutations, etc. that are implicit throughout the theory of the present series of papers. Rather, it is to be understood that the set-theoretic formulas to be used are those arising from the conventional descriptions that are given of the mathematical objects involved. When applying such conventional descriptions, however, the reader is obliged to check that they are well-defined and do not depend upon the use of arbitrary choices that are not describable via well-defined set-theoretic formulas.

(iii) The sharp contrast between

· the canonicality imparted by descriptions via set-theoretic formulas in the context of extensions of the universe in which one works [cf. Remarks 3.1.3, 3.1.4] and
· the situation that arises if one allows, in one’s descriptions, the various arbitrary choices arising from invocations of the axiom of choice

may be understood somewhat explicitly if one attempts to “catalogue the various possibilities” corresponding to various possible choices that may occur in one’s description. That is to say, such a “cataloguing operation” typically obligates one to work with “sets of very large cardinality”, many of which must be constructed by means of set-theoretic exponentiation [i.e., such as the operation of passing from a set $E$ to the power set “$2^E$” of all subsets of $E$].
Such a rapid outbreak of “unwieldy large sets” is reminiscent of the rapid growth, in the $p$-adic crystalline theory, of the $p$-adic valuations of the denominators that occur when one formally integrates an arbitrary connection, as opposed to a “canonical connection” of the sort that arises from a crystalline representation. In the $p$-adic theory, such “canonical connections” are typically related to “canonical liftings”, such as, for instance, those that occur in $p$-adic Teichmüller theory [cf. [pOrd], [pTeich]]. In this context, it is of interest to recall that the canonical liftings of $p$-adic Teichmüller theory may, under certain conditions, be thought of as liftings “of minimal complexity” in the sense that their Witt vector coordinates are given by polynomials of minimal degree [cf. the computations of [Finot]].

Remark 3.6.3.

(i) In the context of Remark 3.6.2, it is useful to recall the fundamental reason for the need to pursue “inter-universality” in the present series of papers [cf. the discussion of [IUTchIII], Remark 1.2.4; [IUTchIII], Remark 1.4.2], namely,

since étale fundamental groups [i.e., in essence, Galois groups] are defined as certain automorphism groups of fields/rings, the definition of such a Galois group as a certain automorphism group of some ring structure is fundamentally incompatible with the vertical and horizontal arrows of the log-theta-lattice [i.e., which do not arise from ring homomorphisms]!

In this respect, “transformations” such as the vertical and horizontal arrows of the log-theta-lattice differ, quite fundamentally, from “transformations” that are compatible with the ring structures on the domain and codomain, i.e., morphisms of rings/schemes, which tautologically give rise to functorial morphisms between the respective étale fundamental groups.
Put another way, in the notation of [IUTchI], Definition 3.1, (e), (f) [which will be applied throughout the remainder of the present Remark 3.6.3], for, say, $\underline{v} \in \underline{\mathbb{V}}^{\mathrm{non}}$,

the only natural correspondence that may be described by means of set-theoretic formulas between the isomorphs of the local base field Galois groups “$G_{\underline{v}}$” on either side of a vertical or horizontal arrow of the log-theta-lattice is the correspondence constituted by an indeterminate isomorphism of topological groups.

A similar statement may be made concerning the isomorphs of the geometric fundamental group $\Delta_{\underline{v}} \stackrel{\mathrm{def}}{=} \mathrm{Ker}(\Pi_{\underline{v}} \twoheadrightarrow G_{\underline{v}})$ on either side of a vertical [but not horizontal! cf. the discussion of (ii) below] arrow of the log-theta-lattice. That is to say, the only natural correspondence that may be described by means of set-theoretic formulas between these isomorphs is the correspondence constituted by an indeterminate isomorphism of topological groups equipped with some outer action by the respective isomorph of “$G_{\underline{v}}$” [cf. the discussion of [IUTchIII], Remark 1.2.4]. Here, again we recall from the discussion of Remark 3.6.2, (i), (ii), that it is only by working with such correspondences that may be described by means of set-theoretic formulas that one may obtain descriptions that allow one to calculate the operations performed in one universe from the point of view of an alien universe.

(ii) One fundamental difference between the vertical and horizontal arrows of the log-theta-lattice is that whereas, for, say, $\underline{v} \in \underline{\mathbb{V}}^{\mathrm{bad}}$,

(V1) one identifies, up to isomorphism, the isomorphs of the full arithmetic fundamental group “$\Pi_{\underline{v}}$” on either side of a vertical arrow,

(H1) one distinguishes the “$\Delta_{\underline{v}}$’s” on either side of a horizontal arrow, i.e., one only identifies, up to isomorphism, the local base field Galois groups “$G_{\underline{v}}$” on either side of a horizontal arrow

[cf. the discussion of [IUTchIII], Remark 1.4.2]. One way to understand the fundamental reason for this difference is as follows.
(V2) In order to construct the log-link, i.e., at a more concrete level, the power series that defines the $p_{\underline{v}}$-adic logarithm at $\underline{v}$, it is necessary to avail oneself of the local ring structures at $\underline{v}$ [cf. the discussion of [IUTchIII], Definition 1.1, (i), (ii)], which may only be reconstructed from the full “$\Pi_{\underline{v}}$” [i.e., not from “$G_{\underline{v}}$” stripped of its structure as a quotient of “$\Pi_{\underline{v}}$”; cf. the discussion of [IUTchIII], Remark 1.4.1, (i); [IUTchIII], Remark 2.1.1, (ii); [AbsTopIII], §I3].

(H2) In order to construct the $\Theta^{\times\mu}_{\mathrm{gau}}$-/$\Theta^{\times\mu}_{\mathrm{LGP}}$-/$\Theta^{\times\mu}_{\mathfrak{lgp}}$-links, i.e., at a more concrete level, the correspondence

$$q \mapsto \left\{ q^{j^2} \right\}_{j=1,\ldots,l^{\divideontimes}}$$

[cf. [IUTchII], Remark 4.11.1], it is necessary, in effect, to construct an “isomorphism” between a mathematical object [i.e., the theta values “$q^{j^2}$”] that depends, in an essential way, on regarding the various “$j$” as distinct labels [which are constructed from “$\Delta_{\underline{v}}$”!] and a mathematical object [i.e., “$q$”] that is independent of these labels; it is then a tautology that such an “isomorphism” may only be achieved if the labels, i.e., in essence, “$\Delta_{\underline{v}}$”, on either side of the “isomorphism” are kept distinct from one another.

Here, we observe in passing that the “apparently horizontal arrow-related” issue discussed in (H2) of simultaneous realization of “label-dependent” and “label-free” mathematical objects is reminiscent of the vertical arrow portion of the bi-coricity theory of [IUTchIII], Theorem 1.5 [cf. the discussion of [IUTchIII], Remark 1.5.1, (i), (ii); Step (vii) of the proof of [IUTchIII], Corollary 3.12].

Bibliography

[ABKP] J. Amorós, F. Bogomolov, L. Katzarkov, T. Pantev, Symplectic Lefschetz fibrations with arbitrary fundamental groups, J. Differential Geom. 54 (2000), pp. 489-545.
[NerMod] S. Bosch, W. Lütkebohmert, M. Raynaud, Néron Models, Ergebnisse der Mathematik und ihrer Grenzgebiete 21, Springer-Verlag (1990).
[Drk] F. R.
Drake, Set Theory: an Introduction to Large Cardinals, Studies in Logic and the Foundations of Mathematics 76, North-Holland (1974). [DmMn] H. Dym and H. P. McKean, Fourier Series and Integrals, Academic Press (1972). [Edw] H. M. Edwards, Riemann’s Zeta Function, Academic Press (1974). [Falt] G. Faltings, Endlichkeitssätze für Abelschen Varietäten über Zahlkörpern, In- vent. Math. 73 (1983), pp. 349-366. [Ffmn] S. Feferman, Set-theoretical Foundations of Category Theory, Reports of the Midwest Category Seminar III, Lecture Notes in Mathematics 106, Springer- Verlag (1969), pp. 201-247. [Finot] L. R. A. Finotti, Minimal degree liftings of hyperelliptic curves, J. Math. Sci. Univ. Tokyo 11 (2004), pp. 1-47. [FK] E. Freitag and R. Kiehl, Étale Cohomology and the Weil Conjecture, Springer Verlag (1988). [Harts] R. Hartshorne, Algebraic Geometry, Graduate Texts in Mathematics 52, Springer- Verlag (1977). [Ih] Y. Ihara, Comparison of some quotients of fundamental groups of algebraic curves over p-adic fields, Galois-Teichmüller Theory and Arithmetic Geometry, Adv. Stud. Pure Math. 63, Math. Soc. Japan (2012), pp. 221-250. [Kobl] N. Koblitz, p-adic Numbers, p-adic Analysis, and Zeta-Functions, Graduate Texts in Mathematics 58, Springer-Verlag (1984). [Lang1] S. Lang, Algebraic number theory, Second Edition, Graduate Texts in Mathe- matics 110, Springer-Verlag (1986). [Lang2] S. Lang, Introduction to complex hyperbolic spaces, Springer-Verlag (1987). [McLn] S. MacLane, One Universe as a Foundation for Category Theory, Reports of the Midwest Category Seminar III, Lecture Notes in Mathematics 106, Springer- Verlag (1969). [Mss] D. W. Masser, Note on a conjecture of Szpiro in Astérisque 183 (1990), pp. 19-23. [Milne] J. S. Milne, Abelian Varieties in Arithmetic Geometry, edited by G. Cornell and J. H. Silverman, Springer-Verlag (1986), pp. 103-150. [pOrd] S. Mochizuki, A Theory of Ordinary p-adic Curves, Publ. Res. Inst. Math. Sci. 32 (1996), pp. 957-1151. 
[pTeich] S. Mochizuki, Foundations of p-adic Teichmüller Theory, AMS/IP Studies in Advanced Mathematics 11, American Mathematical Society/International Press (1999).
[InpTch] S. Mochizuki, An Introduction to p-adic Teichmüller Theory, Cohomologies p-adiques et applications arithmétiques I, Astérisque 278 (2002), pp. 1-49.
[pGC] S. Mochizuki, The Local Pro-p Anabelian Geometry of Curves, Invent. Math. 138 (1999), pp. 319-423.
[HASurI] S. Mochizuki, A Survey of the Hodge-Arakelov Theory of Elliptic Curves I, Arithmetic Fundamental Groups and Noncommutative Algebra, Proceedings of Symposia in Pure Mathematics 70, American Mathematical Society (2002), pp. 533-569.
[HASurII] S. Mochizuki, A Survey of the Hodge-Arakelov Theory of Elliptic Curves II, Algebraic Geometry 2000, Azumino, Adv. Stud. Pure Math. 36, Math. Soc. Japan (2002), pp. 81-114.
[CanLift] S. Mochizuki, The Absolute Anabelian Geometry of Canonical Curves, Kazuya Kato’s fiftieth birthday, Doc. Math. 2003, Extra Vol., pp. 609-640.
[GeoAnbd] S. Mochizuki, The Geometry of Anabelioids, Publ. Res. Inst. Math. Sci. 40 (2004), pp. 819-881.
[SemiAnbd] S. Mochizuki, Semi-graphs of Anabelioids, Publ. Res. Inst. Math. Sci. 42 (2006), pp. 221-322.
[Cusp] S. Mochizuki, Absolute anabelian cuspidalizations of proper hyperbolic curves, J. Math. Kyoto Univ. 47 (2007), pp. 451-539.
[FrdI] S. Mochizuki, The Geometry of Frobenioids I: The General Theory, Kyushu J. Math. 62 (2008), pp. 293-400.
[EtTh] S. Mochizuki, The Étale Theta Function and its Frobenioid-theoretic Manifestations, Publ. Res. Inst. Math. Sci. 45 (2009), pp. 227-349.
[GenEll] S. Mochizuki, Arithmetic Elliptic Curves in General Position, Math. J. Okayama Univ. 52 (2010), pp. 1-28.
[AbsTopI] S. Mochizuki, Topics in Absolute Anabelian Geometry I: Generalities, J. Math. Sci. Univ. Tokyo 19 (2012), pp. 139-242.
[AbsTopII] S. Mochizuki, Topics in Absolute Anabelian Geometry II: Decomposition Groups and Endomorphisms, J. Math. Sci. Univ. Tokyo 20 (2013), pp. 171-269.
[AbsTopIII] S. Mochizuki, Topics in Absolute Anabelian Geometry III: Global Reconstruction Algorithms, J. Math. Sci. Univ. Tokyo 22 (2015), pp. 939-1156.
[IUTchI] S. Mochizuki, Inter-universal Teichmüller Theory I: Construction of Hodge Theaters, RIMS Preprint 1756 (August 2012), to appear in Publ. Res. Inst. Math. Sci.
[IUTchII] S. Mochizuki, Inter-universal Teichmüller Theory II: Hodge-Arakelov-theoretic Evaluation, RIMS Preprint 1757 (August 2012), to appear in Publ. Res. Inst. Math. Sci.
[IUTchIII] S. Mochizuki, Inter-universal Teichmüller Theory III: Canonical Splittings of the Log-theta-lattice, RIMS Preprint 1758 (August 2012), to appear in Publ. Res. Inst. Math. Sci.
[Pano] S. Mochizuki, A Panoramic Overview of Inter-universal Teichmüller Theory, Algebraic number theory and related topics 2012, RIMS Kōkyūroku Bessatsu B51, Res. Inst. Math. Sci. (RIMS), Kyoto (2014), pp. 301-345.
[MnLg] S. Mochizuki, Monomorphisms in Categories of Log Schemes, Kodai Math. J. 38 (2015), pp. 365-429.
[BogIUT] S. Mochizuki, Bogomolov’s proof of the geometric version of the Szpiro Conjecture from the point of view of inter-universal Teichmüller theory, Res. Math. Sci. 3 (2016), 3:6.
[Par] A. N. Parshin, Finiteness theorems and hyperbolic manifolds, The Grothendieck Festschrift, Vol. III, Progress in Mathematics 88, Birkhäuser (1990).
[Silv] J. H. Silverman, The Arithmetic of Elliptic Curves, Graduate Texts in Mathematics 106, Springer-Verlag (1986).
[Arak] C. Soulé, D. Abramovich, J.-F. Burnol, J. Kramer, Lectures on Arakelov Geometry, Cambridge studies in advanced mathematics 33, Cambridge University Press (1994).
[vFr] M. van Frankenhuijsen, About the ABC conjecture and an alternative, Number theory, analysis and geometry, Springer-Verlag (2012), pp. 169-180.
[Vjt] P. Vojta, Diophantine approximations and value distribution theory, Lecture Notes in Mathematics 1239, Springer-Verlag (1987).
[Zh] S. Zhang, Geometry of algebraic points, First International Congress of Chinese Mathematicians (Beijing, 1998), AMS/IP Stud. Adv. Math. 20 (2001), pp. 185-198.

Updated versions of preprints are available at the following webpage: http://www.kurims.kyoto-u.ac.jp/~motizuki/papers-english.html